Big Graph Data Sets

Quite a few big graphs are publicly available, most of them web graphs and social networks. Thanks go to the researchers who did the hard work of collecting and preparing these data sets.

Real-world Data Sets

General Graph Data Sets

RDF Graphs

An RDF (Resource Description Framework) graph is a special kind of graph whose nodes and edges are labeled (possibly with multiple labels). A lot of RDF data is available, but data quality can be a problem. To use RDF data as graph data, some transformation has to be done beforehand (e.g., id mapping).
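As a rough illustration of the id mapping step, the sketch below turns (subject, predicate, object) triples into integer-labeled edges. It assumes the triples have already been parsed into Python tuples of strings; the function name `encode_triples` and the `ex:` identifiers are made up for this example and do not come from any particular RDF library.

```python
def encode_triples(triples):
    """Map string triples to (src_id, label_id, dst_id) integer edges.

    Hypothetical helper: ids are assigned in order of first appearance.
    """
    node_ids = {}   # subject/object string -> dense integer node id
    label_ids = {}  # predicate string -> dense integer edge label id
    edges = []
    for subj, pred, obj in triples:
        # setdefault returns the existing id, or stores and returns the next free one
        src = node_ids.setdefault(subj, len(node_ids))
        dst = node_ids.setdefault(obj, len(node_ids))
        label = label_ids.setdefault(pred, len(label_ids))
        edges.append((src, label, dst))
    return edges, node_ids, label_ids


# Made-up sample data, only for illustration
sample = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:worksWith", "ex:carol"),
]
edges, node_ids, label_ids = encode_triples(sample)
print(edges)
```

Keeping the two dictionaries around makes it possible to translate results back to the original identifiers after the graph computation.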

Synthetic Graph Generators

Graph generators are handy to experiment with. Their behavior is controllable, which makes them well suited to system tests and proofs of concept. Generators are built for different purposes, and benchmarking is one of the most common use cases. Many generators use the R-MAT model [1] to generate very large graphs.
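To give a feel for how R-MAT works, here is a minimal sketch of the recursive idea: the adjacency matrix of a graph with 2^scale nodes is split into four quadrants, one quadrant is picked with probabilities a, b, c and d = 1 - a - b - c, and the choice is repeated inside the picked quadrant until a single cell (an edge) remains. The function name `rmat_edges` and the default probabilities are assumptions chosen for illustration, not the settings of any specific generator, and this version keeps duplicate edges and self-loops.

```python
import random


def rmat_edges(scale, num_edges, a=0.57, b=0.19, c=0.19, seed=None):
    """Generate num_edges directed edges over 2**scale nodes, R-MAT style.

    Illustrative sketch: the defaults for a, b, c are assumed values,
    and d is implied as 1 - a - b - c.
    """
    rng = random.Random(seed)
    edges = []
    for _ in range(num_edges):
        src = dst = 0
        for _level in range(scale):
            # Each level fixes one more bit of the row (src) and column (dst)
            src <<= 1
            dst <<= 1
            r = rng.random()
            if r < a:
                pass          # top-left quadrant: both bits stay 0
            elif r < a + b:
                dst |= 1      # top-right quadrant
            elif r < a + b + c:
                src |= 1      # bottom-left quadrant
            else:
                src |= 1      # bottom-right quadrant
                dst |= 1
        edges.append((src, dst))
    return edges


edges = rmat_edges(scale=4, num_edges=100, seed=1)
print(len(edges), max(max(s, d) for s, d in edges))
```

Skewing the probabilities toward one quadrant (a larger `a`) concentrates edges on a few nodes, which is what gives the generated graphs their heavy-tailed degree distribution; with all four probabilities equal, the edges are spread uniformly.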


[1] Chakrabarti, Deepayan, Yiping Zhan, and Christos Faloutsos. "R-MAT: A recursive model for graph mining." In Proceedings of the SIAM International Conference on Data Mining (SDM), 2004.
