Syracuse University · Data Lab

Flickr

A network data set crawled from Flickr. Both the contact network and selected group membership information are included.

Nodes: 81K
Edges: 5.9M
Avg Degree: 73.3
Missing: No
Category: Photo
Dataset Details

Source

Lei Tang*, Huan Liu*

Dataset Information

4 files are included:

1. nodes.csv
-- the list of all users. It serves as a dictionary of every node id used in the data set and is useful for fast reference.

2. groups.csv
-- the list of all groups. It contains every group id used in the data set.

3. edges.csv
-- the friendship network among the users. Each friendship is represented as an edge.
Since the network is symmetric, each edge is listed only once. Here is an example:

1,2

This means the user with id "1" is a friend of the user with id "2".

4. group-edges.csv
-- the user-group membership. In each line, the first entry is the user id and the second entry is the group index.
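
The file layout above can be read with a few lines of Python. The following is a minimal sketch, not the authors' loader: it assumes the files have no header row and that all ids are integers, and the function names `load_edges` and `load_memberships` are hypothetical. Because each friendship is listed only once, the sketch adds both directions when building the adjacency map. The sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def load_edges(lines):
    """Build an undirected adjacency map from 'u,v' rows.

    Each friendship is listed once in edges.csv, so both
    directions are added here. Assumes integer ids, no header.
    """
    adjacency = defaultdict(set)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        u, v = int(row[0]), int(row[1])
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def load_memberships(lines):
    """Map each user id to its set of group indices ('user,group' rows)."""
    groups_of = defaultdict(set)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        user, group = int(row[0]), int(row[1])
        groups_of[user].add(group)
    return groups_of


# Tiny in-memory stand-ins for edges.csv and group-edges.csv (made-up rows).
sample_edges = io.StringIO("1,2\n1,3\n")
sample_memberships = io.StringIO("1,5\n2,5\n3,7\n")

adjacency = load_edges(sample_edges)
groups_of = load_memberships(sample_memberships)

print(sorted(adjacency[1]))  # [2, 3]
print(sorted(adjacency[2]))  # [1]
print(sorted(groups_of[1]))  # [5]
```

With the real files, the same functions could be called on an open file handle, for example `with open("edges.csv", newline="") as f: adjacency = load_edges(f)`.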

For more details, please check the relevant papers and code.

Attribute Information

This data set was crawled from Flickr ( http://www.Flickr.com ), an image and video hosting website, web services suite, and online community.
It contains the crawled friendship network and group memberships. For easier understanding, all contents are organized in CSV file format.

Basic statistics:
Number of users: 80,513
Number of friendship pairs: 5,899,882
Number of groups: 195
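
As a quick sanity check on these counts, the arithmetic below derives a few summary figures. It is a sketch under one assumption: that each of the 5,899,882 friendship pairs is counted once, as the description of edges.csv states. Under that assumption, the 73.3 figure listed as "Avg Degree" on this page matches the number of edges per node, while the mean number of friends per user would be twice that value. The variable names are illustrative only.

```python
num_users = 80_513
num_friendship_pairs = 5_899_882

# Edges divided by nodes (matches the 73.3 shown on the page).
edges_per_node = num_friendship_pairs / num_users

# Mean degree of an undirected graph: every edge contributes to two nodes.
mean_degree = 2 * num_friendship_pairs / num_users

# Fraction of all possible user pairs that are actually connected.
possible_pairs = num_users * (num_users - 1) / 2
density = num_friendship_pairs / possible_pairs

print(f"{edges_per_node:.1f}")  # 73.3
print(f"{mean_degree:.1f}")     # 146.6
print(f"{density:.5f}")         # 0.00182
```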

Relevant Papers

1. Lei Tang and Huan Liu. Relational Learning via Latent Social Dimensions. In Proceedings of The 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’09), Pages 817–826, 2009.

2. Lei Tang and Huan Liu. Scalable Learning of Collective Behavior based on Sparse Social Dimensions. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM’09), 2009.

How to Cite

If you publish material based on data from this repository, please acknowledge the Data Lab Social Computing Data Repository at Syracuse University in your acknowledgements. This helps others find and replicate your work.

APA Format

R. Zafarani and H. Liu. (2026). Social Computing Data Repository [https://datasets.syr.edu]. Data Lab, Syracuse University.

BibTeX Format

@misc{DataLab:SU,
  author       = {R. Zafarani and H. Liu},
  year         = {2026},
  title        = {Social Computing Data Repository},
  url          = {https://datasets.syr.edu},
  institution  = {Data Lab, Syracuse University}
}