Basics of directed networks

Objectives

This is an introduction to analysing directed networks in R using the packages sna and network. We will revisit the adjacency matrix and the notion of tie variables for directed networks.

We will learn the basic

  • differences between in- and out-degree
  • definitions of the basic subgraphs
    • dyads
    • triads

We will use a dataset of 73 high school pupils collected by James Coleman (1964) that comes with the package sna.

For full details of the packages, see https://cran.r-project.org/web/packages/sna/sna.pdf and https://cran.r-project.org/web/packages/network/network.pdf.

Load network

Load sna and network

library(sna)
library(network)

The background to, and a description of, the dataset are provided in the help file

?coleman

Now load the network

data(coleman)

Identifying objects and extracting adjacency matrix

What do I have in my workspace and what did I load?

The function data() loaded something. Use the general-purpose command ls() to list what is in your workspace

ls()
## [1] "coleman"

This is not one adjacency matrix but

class(coleman)
## [1] "array"

As described in the help file, this is an array holding 2 adjacency matrices, each of dimension \(73 \times 73\)

dim(coleman)
## [1]  2 73 73

The individual slices are matrices

class(coleman[1,,])
## [1] "matrix"
class(coleman[2,,])
## [1] "matrix"
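
To see how this indexing works on something small enough to print, here is a minimal sketch using a made-up array with the same layout (wave first, then sender, then receiver). The object `toy` and its values are hypothetical and only illustrate the slicing; they are not part of the Coleman data.

```r
# Hypothetical toy data: 2 waves of a 3-node directed network,
# stored in the same order as coleman: wave x sender x receiver
toy <- array(0, dim = c(2, 3, 3))
toy[1, 1, 2] <- 1    # wave 1: node 1 -> node 2
toy[1, 2, 3] <- 1    # wave 1: node 2 -> node 3
toy[2, 3, 1] <- 1    # wave 2: node 3 -> node 1

dim(toy)             # number of waves, senders, receivers
wave1 <- toy[1, , ]  # fixing the first index leaves a 3 x 3 matrix
is.matrix(wave1)
wave1[1, ]           # ties sent by node 1 in wave 1
wave1[, 3]           # ties received by node 3 in wave 1
```

Fixing the first index selects a wave; the two remaining indices are then the usual row (sender) and column (receiver) of an adjacency matrix.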

You can print coleman[1,,] and coleman[2,,] to screen to view the adjacency matrices (but that will just be a lot of ones and zeros, in particular \(2 \times 73 \times 73\) ones and zeros). You can visualise the adjacency matrices in these matrix plots

par( mfrow = c(1,2))
plot.sociomatrix( coleman[1,,] , drawlab=FALSE , drawlines = FALSE, xlab = 'FALL')
plot.sociomatrix( coleman[2,,] , drawlab=FALSE , drawlines = FALSE, xlab = 'SPRING')

In the adjacency matrix, rows record the ties sent, and columns record the ties received. Student 1 has the out-tie variables \(x_{1,2},x_{1,3},\ldots,x_{1,n}\)

 coleman[1,1,]
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
##  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1  0  0  0  0  0  1  0  0  0  0  0 
## 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 
##  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
## 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 
##  0  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

For example, that coleman[1,1,14] is 1 means that student 1 has a tie to student 14, and that coleman[1,1,2] is 0 means that student 1 does not have a tie to student 2.

Outdegree

Taking the sum of the out-ties

 sum(coleman[1,1,])
## [1] 5

gives us student 1’s outdegree, \(d_i^o= \sum_{j,j\neq i}x_{i,j}\).

To get the outdegree of all pupils, sum across columns for all rows

 rowSums(coleman[1,,])
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
##  5  2  2  4  2  3  1  2  5  0  4  3  4  2  1  3  2  3  5  5  4  4  9  1  0  6 
## 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 
##  3  2  2  2  3  4  4  2  2  4  2  4  5  2  4  4  3  1  5  4  4  4  4  4  3  5 
## 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 
##  2  6  3  3  6  2  3  4  2  2  5  4  3  5  5  4  5  5  6  0  0
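
The outdegree formula can be checked by hand on a small made-up network. The following is a minimal sketch; the matrix `A` is hypothetical and not taken from the Coleman data. It computes \(d_i^o\) once directly from the definition (summing over all \(j \neq i\)) and once with rowSums(); the two agree as long as the diagonal of the adjacency matrix is zero.

```r
# Hypothetical 4-node directed network (rows send, columns receive)
A <- matrix(c(0, 1, 1, 0,
              0, 0, 1, 0,
              1, 0, 0, 0,
              0, 0, 0, 0), nrow = 4, byrow = TRUE)

# Outdegree from the definition: sum x_ij over all j different from i
n <- nrow(A)
outdeg_loop <- sapply(seq_len(n), function(i) sum(A[i, -i]))

# Outdegree via row sums (equivalent because the diagonal is all zeros)
outdeg_rows <- rowSums(A)

outdeg_loop
outdeg_rows
```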

And, as for undirected networks, we can chart the outdegree distribution

 plot( table (rowSums(coleman[1,,]) ))

Indegree

The ties a student has received are recorded in the column for that student. Student 1 has the in-tie variables \(x_{2,1},x_{3,1},\ldots,x_{n,1}\)

 coleman[1,,1]
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
##  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
## 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 
##  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
## 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 
##  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

For example, that coleman[1,14,1] is 0 means that student 1 has not received a tie from student 14.

To get the indegree of all pupils, sum across rows for all columns

 colSums(coleman[1,,])
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
##  0  0  0  1  1  1  1  1  1  1  2  2  2  2  2  3  3  4  4  6 10 10  0  0  0  1 
## 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 
##  1  1  1  2  2  2  2  2  3  3  3  4  4  5  5  5  9  1  3  2  3  4  4  5  5  6 
## 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 
##  6  7  8  0  1  2  2  3  3  5  5  5  6  7  6  7  7  8 10  0  0

And, as for undirected networks, we can chart the indegree distribution

 plot( table (colSums(coleman[1,,]) ))

The numbers of ties sent and received are not the same!

For example, student 1 has nominated 5 other people but student 1 has not been nominated back at all. Plotting the outdegrees against the indegrees

 plot( jitter(rowSums(coleman[1,,]) ) , jitter( colSums(coleman[1,,]) ) ) 

reveals that some pupils send more ties than they receive, and some send fewer than they receive.

One reason for this is that some pairs are asymmetric and other pairs are symmetric in their choices (there are of course other possible explanations; see Igarashi and Koskinen 2020).
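
Although in- and outdegree can differ for an individual node, every tie is sent by one node and received by another, so the totals (and hence the means) of the two degree sequences must coincide. A minimal sketch on a made-up matrix illustrates this; `A` is hypothetical and not part of the Coleman data.

```r
# Hypothetical 4-node directed network (rows send, columns receive)
A <- matrix(c(0, 1, 1, 0,
              0, 0, 1, 0,
              1, 0, 0, 0,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)

outdeg <- rowSums(A)   # ties sent by each node
indeg  <- colSums(A)   # ties received by each node

outdeg - indeg         # positive: sends more than receives; negative: the reverse
sum(outdeg)            # total number of ties, counted from the senders
sum(indeg)             # the same total, counted from the receivers
```

The individual differences are driven entirely by asymmetric ties: a reciprocated tie adds one to both the in- and the outdegree of each of the two nodes involved.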

Dyads and reciprocity

A dyad is a pair of nodes and their ties \((X_{i,j},X_{j,i})\) to each other. To investigate whether pairs are symmetric or asymmetric, consider e.g. that 1 nominated 14 but 14 did not nominate 1 back - this pair is asymmetric.

 coleman[1,1,14]
## [1] 1
 coleman[1,14,1]
## [1] 0

The dyad 19 and 4

 coleman[1,4,19]
## [1] 1
 coleman[1,19,4]
## [1] 1

is symmetric, or reciprocal - we call this a mutual dyad.

In general we can distinguish between dyads that are Mutual, Asymmetric, and Null (Holland and Leinhardt 1972).

Fun Fact: Linear algebra

Recall that the matrix product of \(X\) with itself has entries \((XX)_{i,j}=\sum_{k}X_{i,k} X_{k,j}\). The diagonal entry \((XX)_{i,i}=\sum_{k}X_{i,k} X_{k,i}\) therefore counts the reciprocated ties of node \(i\), and since every mutual dyad is counted once from each end, \(\mathrm{tr}(XX)\) gives you twice the number of reciprocated dyads. There is no standard command for the trace in R, but you can take the sum() of the diagonal diag() of the matrix product, e.g. for adjacency matrix A, sum( diag( A %*% A )). The standard product * in R is the elementwise (Hadamard) product, so A * t(A) computes \(X \odot X^{\top}\), where \((X \odot X^{\top})_{i,j}=X_{i,j} X_{j,i}\). Consequently, for adjacency matrix A, A * t(A) is the matrix of reciprocated ties.
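
These two identities are easy to verify on a network small enough to inspect by eye. The sketch below uses a made-up matrix `A` (hypothetical, not the Coleman data) with exactly one mutual dyad, between nodes 1 and 2.

```r
# Hypothetical 4-node network: 1 -> 2, 2 -> 1, 1 -> 3, 3 -> 4
A <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 0,
              0, 0, 0, 1,
              0, 0, 0, 0), nrow = 4, byrow = TRUE)

AA <- A %*% A      # matrix product
sum(diag(AA))      # trace: twice the number of mutual dyads

R <- A * t(A)      # elementwise product of A with its transpose
R                  # entry (i, j) is 1 only if both i -> j and j -> i exist
sum(R) / 2         # number of mutual dyads
```

Note that `%*%` and `*` give very different results here: the first multiplies rows by columns, the second multiplies matching cells.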

Dyad census

The dyad census tabulates the number of dyads that are Mutual, Asymmetric, and Null

 dyad.census(coleman[1,,])
##      Mut Asym Null
## [1,]  62  119 2447

Fun Fact: census is complete enumeration of dyads

Note that in the Coleman example, the total number of M, A, and N dyads sum up to

 62 + 119 +  2447
## [1] 2628

which is exactly equal to \(n(n-1)/2=73\times72/2=2628\) - this is the total number of (unordered) pairs, the number of cells in the upper triangle of the adjacency matrix.
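
The three counts can also be reproduced with base R alone, which makes the bookkeeping explicit. This is a minimal sketch on a made-up matrix (`A` is hypothetical); it is an illustration of the definitions, not a description of how dyad.census() is implemented internally.

```r
# Hypothetical 4-node network: 1 -> 2, 2 -> 1, 1 -> 3, 3 -> 4
A <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 0,
              0, 0, 0, 1,
              0, 0, 0, 0), nrow = 4, byrow = TRUE)
n <- nrow(A)

mut  <- sum(A * t(A)) / 2            # both directions present
asym <- sum(A != t(A)) / 2           # exactly one direction present
null <- choose(n, 2) - mut - asym    # neither direction present

c(Mut = mut, Asym = asym, Null = null)
choose(73, 2)                        # number of unordered pairs among 73 pupils
```

Each sum is divided by 2 because a dyad shows up in two cells of the matrix, (i, j) and (j, i).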

Triads

For triplets in directed networks, there are many different kinds. For open triads, we could for example consider the three different types you can achieve with 0 Mutual dyads, 2 Asymmetric dyads, and 1 Null dyad.

The triples are labeled by the number of Mutual, Asymmetric, and Null dyads - the so-called MAN labeling scheme. Here, the ‘C’, ‘D’, and ‘U’ distinguish between ‘Cyclic’, ‘Down’, and ‘Up’.
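
As a minimal sketch of the labeling, the three open triads with 0 Mutual, 2 Asymmetric, and 1 Null dyad can be written down as 3-node adjacency matrices. The object names and the helper `man()` below are hypothetical, introduced only for this illustration.

```r
# Three hypothetical 3-node networks (rows send, columns receive)
t021D <- matrix(0, 3, 3); t021D[2, 1] <- 1; t021D[2, 3] <- 1   # 1 <- 2 -> 3
t021U <- matrix(0, 3, 3); t021U[1, 2] <- 1; t021U[3, 2] <- 1   # 1 -> 2 <- 3
t021C <- matrix(0, 3, 3); t021C[1, 2] <- 1; t021C[2, 3] <- 1   # 1 -> 2 -> 3

# Helper (hypothetical): counts of Mutual, Asymmetric, and Null dyads
man <- function(X) {
  mut  <- sum(X * t(X)) / 2
  asym <- sum(X != t(X)) / 2
  c(M = mut, A = asym, N = choose(nrow(X), 2) - mut - asym)
}

# All three share the same MAN counts: 0, 2, 1
sapply(list(`021D` = t021D, `021U` = t021U, `021C` = t021C), man)

# The letter is what separates them: compare the degree sequences
rowSums(t021D)   # 'Down': one node sends both ties
colSums(t021U)   # 'Up': one node receives both ties
rbind(out = rowSums(t021C), `in` = colSums(t021C))   # chain: middle node receives one and sends one
```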

For closed triads, consider the Transitive (‘T’), Cyclic (‘C’), and complete closed triads.

When considering the potential interpretation of these closed triads, imagine scenarios where ties reflect ‘liking’, ‘look up to’, ‘give orders’, ‘passing on information’, and how these structures would be reflected in status, chain of command, access to information, etc.

Triad census

In total there are 16 types of triads, all of which are labelled using the MAN labeling scheme (Holland and Leinhardt 1972). For the Coleman data, we calculate the triad census as

 triad.census(coleman[1,,])
##        003  012  102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210
## [1,] 50171 7384 3957   64  121  128  139   70   23    1  20   43   10    9  34
##      300
## [1,]  22

Fun Fact: census is complete enumeration of triads

Note that in the Coleman example, the total number of triads is

sum(triad.census(coleman[1,,]))
## [1] 62196

which is exactly equal to \[\binom{n}{3}= \frac{n(n-1)(n-2)}{3\times 2}=\frac{73\times72\times71}{6}=62196\] - this is the total number of (unordered) triplets, the number of ways in which you can choose 3-element subsets out of a 73-element set.
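
The arithmetic can be confirmed in base R without loading any data. In this short sketch the vector `census` is simply the triad census output printed above, typed in by hand.

```r
n <- 73
choose(n, 3)                  # number of unordered triples
n * (n - 1) * (n - 2) / 6     # the same count from the explicit formula

# Triad census counts copied from the output above (003, 012, ..., 300)
census <- c(50171, 7384, 3957, 64, 121, 128, 139, 70,
            23, 1, 20, 43, 10, 9, 34, 22)
length(census)                # 16 triad types
sum(census)                   # matches choose(n, 3)
```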

Plotting the network

Plotting the sociogram of the network does not differ from the undirected case, with the exception that the ties are now represented by arrows.

plot( as.network( coleman[1,,] , directed= TRUE), # the network object
      vertex.cex = degree(coleman[1,,], cmode = 'indegree')/5 + .2 )

The size of a node here increases with its indegree centrality: the indegree is divided by 5, and a small constant is added so that nodes with indegree 0 remain visible.

There are two large ‘clusters’ of nodes that are connected within but not between - these are two components.
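
What it means for nodes to be "connected within but not between" can be made concrete with a small base R sketch. The sna package has its own routines for components; the code below deliberately does not use them and instead illustrates the idea on a made-up network, ignoring tie direction (so these are weak components). The matrix `A` and all object names are hypothetical.

```r
# Hypothetical 5-node network with two groups: {1, 2, 3} and {4, 5}
A <- matrix(0, 5, 5)
A[1, 2] <- 1; A[3, 2] <- 1; A[4, 5] <- 1
n <- nrow(A)

# Ignore direction and let every node reach itself
reach <- ((A + t(A) + diag(n)) > 0) * 1

# Repeated squaring extends reachability to longer and longer paths
for (step in seq_len(n - 1)) {
  reach <- ((reach %*% reach) > 0) * 1
}

# Nodes with identical reachability rows belong to the same component
key <- apply(reach, 1, paste, collapse = "")
membership <- match(key, unique(key))

membership          # component label for each node
table(membership)   # component sizes
```

Two nodes end up in the same component exactly when some path of ties, followed in either direction, links them.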

References

Coleman, James Samuel. 1964. Introduction to Mathematical Sociology.

Holland, P. W., and S. Leinhardt. 1972. “Local Structure in Social Networks.” In Sociological Methodology, edited by D. Heise, 2:1–45. 5th Ser. San Francisco, CA: Jossey-Bass.

Igarashi, Tasuku, and Johan Koskinen. 2020. “Overchoosing: A Mechanism of Tie-Formation in Social Networks.” PsyArXiv. https://psyarxiv.com/47q39.