In the previous post, I presented an example of an academic article database to explain the process of analyzing database requirements. Today, we'll take a more practical approach and explore this example by creating an academic paper database in PostgreSQL and a citation graph using Apache AGE. The queries for building this database are available in this GitHub gist.
1. What are graphs?
A graph G = (V, E) is a data structure composed of a set of vertices (V) and a set of edges (E) (Vilas Boas, 2016). We can represent a graph as shown in Figure 1. The letters from 'a' to 'e' represent the vertices, and the edges are the lines that connect them.
In the example of our database, each vertex represents an academic paper. The relationship between these papers is "cited by" since an article references (or cites) other articles. An edge represents a relationship.
2. Requirements
To follow this tutorial, you need to install PostgreSQL and Apache AGE. Below is a list of tutorials to install these programs:
- PostgreSQL with Apache AGE installation on Windows with WSL;
- Installing Apache AGE with Docker on Windows;
- PostgreSQL with Apache AGE installation on Ubuntu;
- Easy Installation of PostgreSQL, AGE & Age-viewer on Mac.
In this tutorial, I created the database using PostgreSQL 13 and Apache AGE 1.3.0. At the time of this post, Apache AGE supports only PostgreSQL 11, 12, and 13. The operating system used is Ubuntu 22.04 LTS.
3. Building the database
Before we begin building the database, let's list some properties of scientific papers:
- Title;
- Author(s);
- Year of publication;
- Publication venue.
As defined in Section 1, the relationship between two papers can be defined as "cited by" or "cites" for simplicity. Since one paper cites another, we can represent this relationship as shown in Figure 2.
Now we can create the database. Assuming that PostgreSQL has just been installed, the following commands will start the PostgreSQL service and call the command-line interface:
sudo service postgresql start
sudo -u postgres psql
Once in the interface, we create a database named papersdb
and connect to it.
create database papersdb;
\c papersdb
To use the Apache AGE extension, it is necessary to install it on the server and load it for each session. And to simplify queries, we add ag_catalog
to the search_path
.
CREATE EXTENSION age;
LOAD 'age';
SET search_path = ag_catalog, "$user", public;
Next, we create a graph called citation_graph
:
SELECT * FROM create_graph('citation_graph');
We now create some entries for the academic papers, which contain the properties defined at the beginning of this section. We will also define an ID to make it easier to connect the edges between each paper. The id
structure is according to the last name of the first author, publication year, and the beginning of the publication title, as in the example oliveira2009automatic
.
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article {id: 'williams2015affect', title: 'Investigating affect in algorithmic composition systems', publisher: 'Psychology of Music',year: 2015, author:['Duncan Williams', 'Alexis Kirke', 'Eduardo R Miranda', 'Etienne Roesch', 'Ian Daly', 'Slawomir Nasuto']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article {id: 'williams2013towards', title: 'Towards affective algorithmic composition', publisher: 'University of Jyväskylä, Department of Music', year: 2013, author:['Duncan Williams', 'Alexis Kirke', 'Eduardo R Miranda', 'Etienne Roesch', 'Slawomir Nasuto']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article {id: 'schubert1999measurement', title: 'Measurement and Time Series Analysis of Emotion in Music', publisher: 'University of New South Wales. Music & Music Education', year: 1999, author:['Emery Schubert']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article {id: 'scherer2004which', title: 'Which Emotions Can be Induced by Music? What Are the Underlying Mechanisms? And How Can We Measure Them?', publisher: 'Journal of New Music Research', year: 2004, author:['Klaus R. Scherer']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article {id: 'gabrielsson2001emotion', title: 'Emotion perceived and emotion felt: Same or different?', publisher: 'Musicae Scientiae', year: 2001, author:['Alf Gabrielsson']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article {id: 'scherer2001emotional', title: 'Emotional effects of music: Production rules', publisher: 'Oxford University Press', year: 2001, author:['Klaus Scherer', 'Marcel Zentner']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article {id: 'juslin2001communicating', title: 'Communicating emotion in music performance: A review and a theoretical framework', publisher: 'Oxford University Press', year: 2001, author:['Patrik Juslin']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article {id: 'williams2015dynamic', title: 'Dynamic Game Soundtrack Generation in Response to a Continuously Varying Emotional Trajectory', publisher: 'Audio Engineering Society Conference: 56th International Conference: Audio for Games', year: 2015, author:['Duncan Williams', 'Alexis Kirke', 'Joel Eaton', 'Eduardo Miranda', 'Ian Daly', 'James Hallowell', 'Etienne Roesch', 'Faustina Hwang', 'Slawomir Nasuto']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article {id: 'williams2015investigating', title: 'Investigating Perceived Emotional Correlates of Rhythmic Density in Algorithmic Music Composition', publisher: 'Association for Computing Machinery', year: 2015, author:['Duncan Williams', 'Alexis Kirke', 'Joel Eaton', 'Eduardo Miranda', 'Ian Daly', 'James Hallowell', 'James Weaver', 'Asad Malik', 'Etienne Roesch', 'Faustina Hwang', 'Slawomir Nasuto']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'daly2015towards', title: 'Towards human-computer music interaction: Evaluation of an affectively-driven music generator via galvanic skin response measures', publisher: '2015 7th Computer Science and Electronic Engineering Conference (CEEC)', year: 2015, author:['Duncan Williams', 'Alexis Kirke', 'Eduardo Miranda', 'Ian Daly', 'Faustina Hwang', 'Slawomir Nasuto', 'Asad Malik', 'James Weaver']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'kirke2013artificial', title: 'Artificial affective listening towards a machine learning tool for sound-based emotion therapy and control', publisher: 'Proceedings of the Sound and Music Computing Conference', year: 2013, author:['Alexis Kirke', 'Eduardo Miranda', 'Slawomir Nasuto']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'kirke2012learningto', title: 'Learning to Make Feelings: Expressive Performance as a part of a machine learning tool for sound-based emotion therapy and control', publisher: 'the 9th Intl Symp on Computer Music Modeling and Retrieval', year: 2012, author:['Alexis Kirke', 'Eduardo Miranda', 'Slawomir Nasuto']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'lopez2010real', title: 'Real-Time Emotion-Driven Music Engine', publisher: 'the 9th Intl Symp on Computer Music Modeling and Retrieval', year: 2010, author:['Alex Lopez', 'Antonio Oliveira', 'Amilcar Cardoso']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'oliveira2008affective', title: 'Affective-driven music production: selection and transformation of music', publisher: 'ARTECH', year: 2008, author:['Antonio Oliveira', 'Amilcar Cardoso']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'oliveira2008modeling', title: 'Modeling affective content of music: A knowledge base approach', publisher: 'Sound and Music Computing Conference', year: 2008, author:['Antonio Oliveira', 'Amilcar Cardoso']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'livingstone2007controlling', title: 'Controlling musical emotionality: an affective computational architecture for influencing musical emotions', publisher: 'Digital Creativity', year: 2007, author:['Steven R. Livingstone', 'Ralf Mühlberger', 'Andrew Brown', 'Andrew Loch']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'livingstone2005dynamic', title: 'Dynamic Response: Real-Time Adaptation for Music Emotion', publisher: 'Creativity & Cognition Studios Press', year: 2005, author:['Steven R. Livingstone','Andrew Brown']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'oliveira2009automatic', title: 'Automatic manipulation of music to express desired emotions', publisher: 'Proceedings of the 6th Sound and Music Computing Conference', year: 2009, author:['Antonio Oliveira', 'Amilcar Cardoso']})
RETURN v $$) as (v agtype);
SELECT * FROM cypher('citation_graph', $$
CREATE (v:Article { id: 'russell1980circumplex', title: 'A circumplex model of affect', publisher: 'American Psychological Association', year: 1980, author:['James Russel']})
RETURN v $$) as (v agtype);
Finally, we define the relationships between articles based on the citations found within each one. We can use the id
to retrieve the vertices and connect them:
SELECT * FROM cypher ('citation_graph', $$
MATCH (a:Article), (b:Article)
WHERE a.id = 'williams2015affect' and b.id = 'williams2013towards'
CREATE (a)-[e:CITES]->(b)
RETURN e
$$) as (e agtype);
To avoid making this post too long, I will not repeat all the queries at this step. Unlike the entries with the information of the papers, which is only one for each paper, the relationships can be "one to many." This is the link to the GitHub gist containing all the queries to create the relationships between the papers. You can copy and paste on your PostgreSQL CLI to run the queries. With the definition of all edges, we have finished building our citation graph, presented in Figure 3:
4. Conclusions
In this article, we have learned what graphs are and how to build a graph database using the PostgreSQL extension Apache AGE. In the next post, I will show you how to use the AGE Viewer to display the graph illustrated in Figure 3.
5. Errata
My intention is to provide access to technology information through reliable sources. If you have found any incorrect information, please let your contribution in the comments below 😊.
6. Related Articles
Collaborate to Innovate: How a Social Network Can Help You Find Your Dream Pair Programming Team
Create a Word Guessing Game With Apache AGE
Step-by-Step Guide to Install PostgreSQL via Package Management for Apache AGE on Ubuntu
Easy guide to install and configure PostgreSQL with Apache AGE on Windows
Installing Apache AGE with Docker on Windows
Easy Installation of PostgreSQL, AGE & Age-viewer on Mac
7. References
Vilas Boas, Matheus Guedes. Graph Theory Class Notes - Basic Concepts on Graphs - Definitions and Theorems. Federal University of Ouro Preto. 2016. Unpublished.
Banner designed by Freepik.
8. Contribute to Apache AGE
Apache AGE website: https://age.apache.org/
Apache AGE Github: https://github.com/apache/age
Top comments (0)