Project 2

Select a network data set from the following list, from somewhere else (sources, your own network), or take the k-th time slice from Reuters terror news, where k is your number on the students' list. The selected network should have at least 500 labeled nodes. To prevent duplication, send me a reservation by e-mail.

If the selected network is not in Pajek's format, you first need to convert it (example).
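
The conversion can be sketched as follows, assuming the source data is a list of (source, target, weight) triples. The function name `edge_list_to_pajek` and the sample data are hypothetical; only the `*Vertices`, `*Arcs` and `*Edges` section keywords come from Pajek's .net format.

```python
def edge_list_to_pajek(edges, directed=True):
    """Return Pajek .net text for a list of (source, target, weight) triples.

    Assumes `edges` is a list (it is traversed twice) and that node labels
    contain no double quotes.
    """
    # Pajek numbers vertices from 1, in order of first appearance here.
    index = {}
    for u, v, _ in edges:
        for label in (u, v):
            if label not in index:
                index[label] = len(index) + 1

    lines = [f"*Vertices {len(index)}"]
    for label, i in index.items():
        lines.append(f'{i} "{label}"')

    # *Arcs holds directed links, *Edges undirected ones.
    lines.append("*Arcs" if directed else "*Edges")
    for u, v, w in edges:
        lines.append(f"{index[u]} {index[v]} {w}")
    return "\n".join(lines) + "\n"


# Hypothetical sample data, for illustration only.
sample = [("alice", "bob", 1), ("bob", "carol", 2), ("carol", "alice", 1)]
net_text = edge_list_to_pajek(sample)
print(net_text)
```

Writing `net_text` to a file with the `.net` extension should give a file Pajek can read; real data sets may need extra handling for encodings, isolated nodes, or multiple link types.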

For the selected network, using Pajek:

  1. determine basic network characteristics (directed/undirected, loops, multiple links; weights?; the number of nodes, number of links, number of components; largest degree, diameter, acyclic?, bow-tie composition for directed networks, …).
  2. draw the degree distribution (in a directed network also the indegree and outdegree distributions). List the top 20 nodes by (in/out) degree.
  3. in a directed network:
    1. determine the number of strong components; if there are many, also their size distribution;
    2. compute the condensation and its depth.
  4. in an undirected network extract the largest (weak) component; in a directed network extract the largest strong component. For it compute the standard importance measures (degree, betweenness, closeness, corrected clustering coefficient; and in a directed network also hubs and authorities). For each measure determine the top 20 nodes.
  5. determine the cores in your network. Extract and draw the largest core with at most 100 nodes.
  6. determine some interesting link islands of your network, draw them and comment on them. If your network is not weighted, select some measure of the importance of links (hints) and compute the weights. Interpret the results.
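
The analysis itself is to be done in Pajek, but the degree distribution (step 2) and the cores (step 5) are easy to cross-check on a small example. The following is a minimal sketch for a simple undirected network, not Pajek's own procedure; the function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Adjacency sets for an undirected network; loops and multiple links are dropped."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def degree_distribution(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(neighbours) for neighbours in adj.values())


def core_numbers(adj):
    """Core number of every node, by repeatedly removing a node of minimum degree."""
    degree = {n: len(neighbours) for n, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Simple quadratic-time selection; fine for small check examples.
        n = min(remaining, key=degree.get)
        k = max(k, degree[n])
        core[n] = k
        remaining.remove(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1
    return core


# Toy example: a triangle a-b-c with a pendant node d attached to a.
toy = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]
adj = build_adjacency(toy)
dist = degree_distribution(adj)
cores = core_numbers(adj)
print(sorted(dist.items()))
print(sorted(cores.items()))
```

In the toy network the triangle forms the 2-core and the pendant node belongs only to the 1-core. Isolated nodes do not appear in an edge list, so they would have to be added separately.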

Write a report. Also attach a ZIP archive with your network data.

