
Project #09

9.pdf

Title: Clustering Wikipedia Articles using Graph Databases

Leader's Name: Robin Chowdhury
Member2 Name: Ludvig Hagberg
Member3 Name: Jacob Sievers
Member4 Name: Hanna Nyblom

...
Related papers: Bin Shao, Haixun Wang, Yanghua Xiao. Managing and
Mining Large Graphs: Systems and Implementations.
Christos Faloutsos, U. Kang. Managing and Mining Large Graphs:
Patterns and Algorithms.
Presentation Day: May 25
Model: BS
Abstract: Wikipedia is a great source of information, but much of its
content has to be organized manually into categories and then ordered
by importance. We believe this can be done automatically with the help
of graphs, and that a graph database is well suited to the task.

We want to build a visual representation of the links between articles
that is based on clustering. To do this we will, in accordance with the
article, store the link data in the Neo4j graph database. A graph
database fits this purpose because it speeds up graph queries and
scales well. The project will also be an exercise in clustering, since
clustering is needed for the visual representation.
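
The idea of storing article links as a graph and then clustering them
can be illustrated with a minimal sketch. Everything in it is an
assumption made for illustration: the sample links, the Article label
and LINKS_TO relationship type in the Cypher string, and the choice of
label propagation as the clustering method (the proposal does not name
an algorithm). The clustering runs in memory on a tiny link list rather
than against a live Neo4j instance.

```python
from collections import defaultdict

# Hypothetical sample of (source article, linked article) pairs.
LINKS = [
    ("Graph theory", "Graph database"),
    ("Graph database", "Neo4j"),
    ("Neo4j", "Graph theory"),
    ("Stockholm", "Sweden"),
    ("Sweden", "KTH"),
    ("KTH", "Stockholm"),
    ("Neo4j", "Sweden"),
]

# Cypher that could store one link in Neo4j. The Article label and the
# LINKS_TO relationship type are assumed names, not taken from the proposal.
STORE_LINK_CYPHER = (
    "MERGE (a:Article {title: $source}) "
    "MERGE (b:Article {title: $target}) "
    "MERGE (a)-[:LINKS_TO]->(b)"
)


def build_neighbors(links):
    """Return an undirected adjacency map built from the link pairs."""
    neighbors = defaultdict(set)
    for source, target in links:
        neighbors[source].add(target)
        neighbors[target].add(source)
    return neighbors


def label_propagation(links, max_rounds=20):
    """Cluster articles by repeatedly adopting the most common neighbor label.

    Ties are broken by choosing the alphabetically smallest label, and nodes
    are visited in sorted order, so the result is deterministic.
    """
    neighbors = build_neighbors(links)
    labels = {node: node for node in neighbors}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):
            counts = defaultdict(int)
            for neighbor in neighbors[node]:
                counts[labels[neighbor]] += 1
            best = min(counts, key=lambda label: (-counts[label], label))
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


if __name__ == "__main__":
    clusters = defaultdict(list)
    for article, label in label_propagation(LINKS).items():
        clusters[label].append(article)
    for members in clusters.values():
        print(sorted(members))
```

On this sample the three graph-related articles end up in one cluster
and the three Sweden-related articles in another, despite the single
link between the two groups. In a real setup the link pairs would be
read from the database instead of a hard-coded list.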

p589-shao.pdf