A social network analysis of dreams

Introduction

The interpretation of dreams is the royal road to a knowledge of the unconscious activities of the mind. –Sigmund Freud

One fun fact about me is that I always remember my dreams when I wake up. Since 2019, I have been keeping a dream journal, writing down the narratives that took place during my sleep before they escaped from memory. Though by no means comprehensive, this dream journal will offer some interesting insight into my unconscious and my real life.

Unlike the Freudian approach, I aim to understand my dreams in a quantitative way, using data analytic tools I learned through research over the past few years, including natural language processing, social network analysis, and data visualization.

Feel free to check the GitHub repository of this project for more information.

Data summary and workflow

The dream corpus used for this demonstration contains 100 dream entries entered between January and June 2019. Below is a glance at the file:

[Figure: dream-example]

As you can see, each dream narrative contains names, locations, and emotional states. In this demonstration, I will first create a social network for the character names in my dreams. Then, I will use sentiment analysis to explore how my emotional states in dreams have fluctuated over time.

Social network analysis of dreams

1. Preparation

We will pre-process the dreams using the nltk package. The code below is adapted from this amazing tutorial.

# import packages
# note: nltk also needs its tokenizer, tagger, and named-entity chunker data,
# downloaded once with nltk.download(); resource names vary by nltk version
import nltk #for extracting names from documents
from nltk import ne_chunk, pos_tag, word_tokenize
from nltk.tree import Tree
import numpy as np
import math
import itertools
import networkx as nx #for social network analysis
import matplotlib.pyplot as plt #for plotting graph
import collections
import csv #for reading corpus

#reading dream corpus
#assumption: 'dreams.csv' is a placeholder; use the path to your own corpus
filename = 'dreams.csv'
dreams = [] #one list of fields per dream entry
with open(filename, 'r', newline='') as csvfile:
    # creating a csv reader object
    csvreader = csv.reader(csvfile)
    # extracting field names through first row
    fields = next(csvreader)
    # extracting each data row one by one
    for row in csvreader:
        dreams.append(row)
    # get total number of dream entries (header row excluded)
    print("Total no. of dreams: %d" % len(dreams))
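
To try the reading step without the real corpus, here is a minimal sketch that feeds csv.reader an in-memory file. The column names and the two dream texts are invented for illustration; the only thing carried over from my data is that the narrative sits in the first column.

```python
import csv
import io

# Hypothetical stand-in for the dream CSV: a header row, then one dream per row.
# Column names and dream texts are invented for illustration.
sample_csv = io.StringIO(
    "dream,date\n"
    '"Alice and Bob were waiting at the station in Boston.",2019-01-03\n'
    '"I felt anxious while Carol drove me home.",2019-01-05\n'
)

reader = csv.reader(sample_csv)
fields = next(reader)              # first row holds the column names
dreams = [row for row in reader]   # remaining rows are dream entries

print(fields)
print(len(dreams), "dreams loaded")
print(dreams[0][0])                # the narrative text is in column 0
```

Quoting the narrative field matters here: dream texts usually contain commas, and csv.reader only keeps them in one column when the field is quoted.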

2. Extracting names from dream corpus

Next, we extract all character names that appeared in the dream corpus and create an adjacency matrix for all possible combinations of names.

#extract names
#create empty lists holding names
name_list = list() #unique names across all dreams
dream_name_list_all = list() #one list of names per dream

for dream in dreams:
    # tokenize, POS-tag, and chunk named entities in the dream text (column 0)
    nltk_results = ne_chunk(pos_tag(word_tokenize(dream[0])))
    dream_name_list = []
    #for each dream in the corpus,
    #create list containing all names in that dream
    for nltk_result in nltk_results:
        #named entities are returned as subtrees; plain tokens are tuples
        if isinstance(nltk_result, Tree):
            #join the tokens of a multi-word entity (no trailing space)
            name = ' '.join(leaf[0] for leaf in nltk_result.leaves())
            #print the type and text of each named entity
            print('Type: ', nltk_result.label(), 'Name: ', name)
            #add character names to the list of all names
            if nltk_result.label() == 'PERSON':
                if name not in dream_name_list:
                    dream_name_list.append(name)
                if name not in name_list:
                    name_list.append(name)
    #add list of names in each dream
    # to the list of names across all dreams
    dream_name_list_all.append(dream_name_list)

#create matrix for all name pairs
graph = np.zeros((len(name_list), len(name_list)))

Now we have a square matrix called graph with one row and one column per unique name, so each cell corresponds to one pair of names. At this point every cell is 0.

Note that nltk also assigns a type to each extracted entity, which we can read using .label(). For example, it distinguishes person names (“PERSON”) from geo-political entities (“GPE”) such as city names. Below are the extractions from two dreams:

3. Creating social network ties

We will now update the values in the adjacency matrix by going through each dream document in the corpus and counting how many times each pair of names has co-occurred. Two people who appeared in the same dream are considered “connected” in the dream social network. The more often two people have co-occurred, the higher their corresponding value in the matrix.

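#for each dream's name list, count each pair of co-occurring names
for dream_name_list in dream_name_list_all:
    for name_a, name_b in itertools.combinations(dream_name_list, 2):
        i, j = name_list.index(name_a), name_list.index(name_b)
        #update both cells so the matrix stays symmetric (ties are undirected)
        graph[i][j] += 1
        graph[j][i] += 1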
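
To make the counting concrete, here is a small sketch of the same co-occurrence idea on three made-up dreams. It uses a Counter keyed by name pairs instead of a matrix, which is my substitution for illustration and not the code used in this project; the names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-dream name lists (each name appears at most once per dream).
dream_names = [
    ["Alice", "Bob", "Carol"],
    ["Bob", "Alice"],
    ["Dave"],
]

pair_counts = Counter()
for names in dream_names:
    # Sorting first makes ("Alice", "Bob") and ("Bob", "Alice") the same key,
    # so a tie is counted once no matter the order of names in a dream.
    for pair in combinations(sorted(names), 2):
        pair_counts[pair] += 1

for (a, b), count in sorted(pair_counts.items()):
    print(a, b, count)
```

A dream with a single character, like the third one, contributes no pairs, so that character ends up as an isolated node in the network.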

4. Plot social network graph

We can use the NetworkX and matplotlib packages to plot the dream social network by converting the adjacency matrix into a graph.

First, let’s visualize how connected each character is, that is, the distribution of node degree (the number of distinct characters each one has co-occurred with). As the first figure shows, the distribution is heavily right-skewed: most characters have few connections and a handful have many, which is reminiscent of the power-law-like degree distributions reported for many real-life social networks.

The second figure shows the degree centrality of the 15 most “popular” characters. Take a guess at who they are–yes, they are among my closest friends and people I know from work.

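#create and plot graph using NetworkX
#from_numpy_matrix was removed in NetworkX 3.0; from_numpy_array replaces it
nxgraph = nx.from_numpy_array(graph)

#rename nodes with character names
mapping = dict(zip(nxgraph, name_list))
nxgraph = nx.relabel_nodes(nxgraph, mapping)
d = dict(nxgraph.degree)

#assumption: get_color_gradient, color1, and color2 are not defined in this post;
#this stand-in linearly blends two hex colors, and the two values are placeholders
color1, color2 = "#6dc9c0", "#F3A0F2"
def get_color_gradient(c1, c2, n):
    rgb1, rgb2 = (np.array([int(c[i:i+2], 16) for i in (1, 3, 5)]) / 255 for c in (c1, c2))
    return [tuple((1 - t) * rgb1 + t * rgb2) for t in np.linspace(0, 1, n)]

#plot the degrees of the 15 most connected names
#sort dictionary by degree, highest first
d_sorted = sorted(d.items(), key=lambda kv: kv[1], reverse = True)
d_sorted = collections.OrderedDict(d_sorted)
keys = list(d_sorted.keys())
values = list(d_sorted.values())
n_top = min(15, len(keys)) #slice [0:14] would only keep 14 names
plt.bar(keys[:n_top], values[:n_top], color = get_color_gradient(color1, color2, n_top))
plt.xticks(rotation = "vertical", size = 7)
plt.title("Degree centrality of top 15 characters in dreams")
plt.show()

#plot histogram for degree distribution
plt.hist(values, color="#F3A0F2",  bins=20)
plt.title("Distribution of degree centrality across all characters")
plt.show()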
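
Since "degree" is easy to confuse with "number of appearances", here is a minimal sketch of what is being counted, worked out by hand on a made-up symmetric matrix. The names and values are hypothetical. Degree counts distinct partners, strength sums co-occurrences, and the normalized degree centrality (what nx.degree_centrality returns) divides degree by n - 1.

```python
# Hypothetical symmetric co-occurrence matrix for four characters.
names = ["Alice", "Bob", "Carol", "Dave"]
matrix = [
    [0, 2, 1, 0],
    [2, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

# Degree: number of distinct partners (non-zero entries in the row).
degree = {name: sum(1 for w in row if w > 0) for name, row in zip(names, matrix)}

# Strength (weighted degree): total number of co-occurrences in the row.
strength = {name: sum(row) for name, row in zip(names, matrix)}

# Normalized degree centrality: degree divided by the number of other nodes.
n = len(names)
centrality = {name: deg / (n - 1) for name, deg in degree.items()}

# Most connected characters first; ties keep their original order.
top = sorted(degree.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top)
```

In this toy example Alice and Bob co-occurred twice, yet that pair adds only 1 to each of their degrees, which is why the bar chart above reflects breadth of connections and not how often someone shows up.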

Next, let’s visualize the network graph. Here, I set the size of each node to scale with degree centrality. The node names are not shown here for privacy.

#show network plot
#set node size by degree centrality
pos = nx.spring_layout(nxgraph, scale=20, k=3/np.sqrt(nxgraph.order()))
nx.draw(nxgraph, pos, with_labels=False, 
    node_size=[math.sqrt(v+1) * 80 for v in d.values()],
    node_color = "#6dc9c0",
    edge_color = "#c5c7c5")
#display the figure
plt.show()
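
A plot like this can make it hard to tell exactly how many separate clusters a network has. As a minimal sketch, connected components can be counted with a breadth-first search over adjacency lists. The tiny graph below is made up, and the function is a plain-Python illustration, not the method used for the figures in this post.

```python
from collections import deque

# Hypothetical adjacency lists: one group of three, one pair, one isolate.
adjacency = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A"},
    "D": {"E"},
    "E": {"D"},
    "F": set(),
}

def connected_components(adj):
    """Return the components of an undirected graph, largest first."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)

components = connected_components(adjacency)
sizes = [len(c) for c in components]
print(sizes)
```

Characters who only ever appeared alone in a dream show up as components of size 1.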

The graph has one large component and several smaller ones, which resembles the structure of many real-life social networks. To examine the network more quantitatively, we can use the NetworkX package to calculate network metrics, including density, transitivity, and degree assortativity.

#calculate and print network metrics:
density = nx.density(nxgraph)
transitivity = nx.transitivity(nxgraph)
assortativity = nx.degree_assortativity_coefficient(nxgraph)
print('\nMy social network is characterized by the following metrics:')
print('density: ', "{:.3f}".format(density))
print('transitivity: ', "{:.3f}".format(transitivity))
print('degree assortativity: ', "{:.3f}".format(assortativity))
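
To show what two of these numbers mean, here is a minimal sketch that computes density and transitivity by hand on a made-up graph (a triangle with one extra node attached). Density is the share of possible ties that exist, 2m / (n(n - 1)) for an undirected graph, and transitivity is the share of connected triples that close into triangles. Degree assortativity is left out because it does not reduce to a one-line formula.

```python
from itertools import combinations

# Hypothetical undirected toy graph: triangle A-B-C plus a pendant node D.
edges = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")}
nodes = sorted({node for edge in edges for node in edge})

neighbors = {node: set() for node in nodes}
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)

# Density: existing ties divided by all possible ties.
n, m = len(nodes), len(edges)
density = 2 * m / (n * (n - 1))

# Transitivity: closed triples divided by all connected triples.
# A connected triple is a node together with two of its neighbors;
# it is closed when those two neighbors are also tied to each other.
triples = 0
closed = 0
for node in nodes:
    for u, v in combinations(sorted(neighbors[node]), 2):
        triples += 1
        if v in neighbors[u]:
            closed += 1
transitivity = closed / triples if triples else 0.0

print("density: {:.3f}".format(density))
print("transitivity: {:.3f}".format(transitivity))
```

Each triangle is counted three times in closed, once per corner, which matches the usual definition of transitivity as 3 × triangles / connected triples.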


Resources

Below are some resources I found useful while exploring my dream data:

Dreambank

Dreambank is a database of over 20,000 dream reports. Researchers have used it to answer many interesting questions about dreams, cognition, and consciousness, such as how dream content shifts with age and how the structures of dream and real-life social networks compare (Han et al., 2016).

Natural language processing

Using the nltk package to extract information from natural language text. Link

Sentiment analysis in python. Link

DeepL is a useful translator I used to translate my dreams from Chinese to English. Link