Derived networks analysis

Collaboration

select/read WAc network
Network/Create Vector/Centrality/Degree/Output
Network/2-Mode Network/Partition into 2 Modes
Operations/Vector+Partition/Extract Subvector [1] = V1
Vector/Create Constant Vector [5695,1] = V2
select V1 as Second vector
Vectors/Max(First,Second)
Vector/Transform/Invert
Operations/Network+Vector/Vector#Network/Output = N
select V1 as First vector
select V2 as Second vector
Vectors/Subtract (First-Second)
Vectors/Max(First,Second)
Vector/Transform/Invert
select/read WAc network
Operations/Network+Vector/Vector#Network/Output = N'
select network N
Network/2-Mode/Transpose 2-Mode
select N' as Second network
Networks/Multiply Networks [Yes]
Network/Create New Network/Transform/Remove/Loops
Network/Create New Network/Transform/Arcs -> Edges/Bidirected Only/Sum
File/Network/Change Label [Ct'(WAc)]

Ct'(WAc): n = 13376, m = 4081577, AveDegree = 610.28364234
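The menu sequence above can be read as computing t(N) * N', where N divides each row of WAc by max(1, outdeg) and N' divides it by max(1, outdeg - 1), followed by removing loops and summing the two directions of each arc pair. The following is a minimal Python sketch of that reading on a toy binary authorship list; the function name `strict_collaboration` and the toy data are hypothetical and are not part of the Pajek workflow.

```python
from collections import defaultdict
from itertools import permutations

def strict_collaboration(works):
    """works: dict mapping a work id to the list of its authors (binary WA)."""
    arcs = defaultdict(float)
    for authors in works.values():
        authors = sorted(set(authors))
        k = len(authors)                       # out-degree of the work in WA
        w_n = 1.0 / max(k, 1)                  # row weight in N  (1 / max(1, outdeg))
        w_n1 = 1.0 / max(k - 1, 1)             # row weight in N' (1 / max(1, outdeg - 1))
        for u, v in permutations(authors, 2):  # u != v, so loops are never created
            arcs[(u, v)] += w_n * w_n1         # contribution to entry (u, v) of t(N) * N'
    # Arcs -> Edges / Bidirected Only / Sum: add the weights of both directions
    edges = {}
    for (u, v), w in arcs.items():
        if u < v:
            edges[(u, v)] = w + arcs.get((v, u), 0.0)
    return edges

# Toy data (assumption): three works, one of them single-authored
works = {"w1": ["A", "B", "C"], "w2": ["A", "B"], "w3": ["D"]}
ct = strict_collaboration(works)
total = sum(ct.values())
```

In this sketch every work with at least two authors contributes a total weight of 1 to the edges, and single-authored works contribute nothing once loops are removed.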

pS cores

      Rank    Vertex                       Value   Id
--------------------------------------------------------
         1       218                     15.8333   BORGATTI_S
         2       192                     15.8333   EVERETT_M
         3       444                      7.6667   FERLIGOJ_A
         4       440                      7.6667   BATAGELJ_V
         5      1603                      7.6667   MRVAR_A
         6       164                      7.6667   DOREIAN_P
         7      3379                      6.4333   STEINLEY_D
         8      3378                      6.4333   BRUSCO_M
         9       989                      6.3333   YANG_J
        10      3790                      6.3333   LESKOVEC_J
        11      3910                      6.0000   LANCICHI_A
        12      3034                      6.0000   FORTUNAT_S
        13      7240                      5.3333   QIAN_X
        14       190                      5.3333   WANG_Y
        15      8053                      5.0000   HERO_A
        16      6838                      5.0000   AMELIO_A
        17      5458                      5.0000   BAJEC_M
        18      5328                      5.0000   SUBELJ_L
        19       308                      5.0000   CHEN_P
        20      4558                      5.0000   PIZZUTI_C
        21      2779                      5.0000   REICHARD_J
        22      2262                      5.0000   BORNHOLD_S
        23      3911                      4.8333   SALES-PA_M
        24      2625                      4.8333   GUIMERA_R
        25      3866                      4.5833   NUSSINOV_Z
        26      4356                      4.5833   RONHOVDE_P
        27      3648                      4.3333   ROSVALL_M
        28      4446                      4.3333   BERGSTRO_C
        29       608                      4.3333   WILSON_R
        30      2346                      4.3333   HANCOCK_E

Citations among authors

read network CiteC 
transform CiteC to bipartite
read network WAc
transpose WAc  = AWc
select AWc as second
multiply networks
select WAc as second
multiply networks one-mode

Normalized citations among authors

April 29, 2018

C:\Users\batagelj\work\Python\WoS\BM

I computed the derived normalized network of citations among authors, t(n(WAc)) * n(CiteC) * n(WAc). Every work has 1 point, which is distributed over the arcs of the derived network. We first remove loops.
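To make the product concrete, here is a small Python sketch under the assumption that n(.) divides each row of a matrix by its row sum, leaving empty rows empty. The helper names and the toy networks are hypothetical. It also computes the weighted indegrees used in the ranking.

```python
from collections import defaultdict

def row_normalize(rows):
    """rows: dict row -> dict col -> weight; divide each row by its row sum."""
    out = {}
    for r, cols in rows.items():
        total = sum(cols.values())
        out[r] = {c: w / total for c, w in cols.items()} if total > 0 else {}
    return out

def normalized_author_citations(wa, cite):
    """Entries of t(n(WA)) * n(Cite) * n(WA) with loops removed."""
    n_wa, n_ci = row_normalize(wa), row_normalize(cite)
    arcs = defaultdict(float)
    for w, cited in n_ci.items():                     # citing work w
        for z, c in cited.items():                    # cited work z
            for u, a in n_wa.get(w, {}).items():      # author u of w
                for v, b in n_wa.get(z, {}).items():  # author v of z
                    if u != v:                        # skip loops
                        arcs[(u, v)] += a * c * b
    return dict(arcs)

def weighted_indegree(arcs):
    indeg = defaultdict(float)
    for (_, v), w in arcs.items():
        indeg[v] += w
    return dict(indeg)

# Toy data (assumption): w1 by A and B cites w2 (by C) and w3 (by C and D)
wa = {"w1": {"A": 1, "B": 1}, "w2": {"C": 1}, "w3": {"C": 1, "D": 1}}
cite = {"w1": {"w2": 1, "w3": 1}}
arcs = normalized_author_citations(wa, cite)
indeg = weighted_indegree(arcs)
```

In the toy example the single citing work w1 distributes exactly 1 point over the arcs, because no author cites himself; removing loops would otherwise reduce the total.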

Let's first look at the largest weighted input degrees - the most cited authors:

  Rank  Vertex      Value   Id                 Rank  Vertex      Value   Id
---------------------------------            ---------------------------------
     1    1072   329.8886   NEWMAN_M             51     532    13.6097   DIETERIC_J
     2    3034   155.4974   FORTUNAT_S           52    1957    13.3569   MEADE_B
     3    2397    80.8228   GIRVAN_M             53    3189    13.2003   BLEI_D
     4    1740    51.6716   BARABASI_A           54     562    13.1853   MACQUEEN_J
     5      88    45.1972   BURT_R               55    2779    12.8176   REICHARD_J
     6    2263    42.5944   ALBERT_R             56     626    12.7497   SAVAGE_J
     7    2631    39.6466   ZACHARY_W            57     444    12.5424   FERLIGOJ_A
     8    3910    38.8163   LANCICHI_A           58     538    12.5064   LANGER_J
     9    2991    38.1660   CLAUSET_A            59     725    12.2316   KNOPOFF_L
    10    3185    31.8938   SCHAEFFE_S           60    4332    12.2182   GREGORY_S
    11     826    31.7021   STROGATZ_S           61      42    12.0322   ARABIE_P
    12      89    30.9933   FREEMAN_L            62     275    11.9246   BURRIDGE_R
    13     145    29.1247   WASSERMA_S           63    2389    11.6503   NG_A
    14    4554    29.0661   MOORE_C              64    3749    11.6379   JORDAN_M
    15     168    26.1896   FAUST_K              65    2262    11.4995   BORNHOLD_S
    16    2146    24.8884   WATTS_D              66      49    11.4380   HARARY_F
    17      38    24.7421   WHITE_H              67     676    11.2859   SNIJDERS_T
    18     480    24.5679   NEWMARK_N            68    3030    11.2303   DANON_L
    19    3208    23.8077   BLONDEL_V            69     249    11.0762   JOHNSON_D
    20     440    23.0214   BATAGELJ_V           70      55    11.0430   LORRAIN_F
    21    3872    22.6844   LAMBIOTT_R           71     578    10.7718   TANG_C
    22    2339    22.5521   VANDONGE_S           72    1201    10.6267   NOWICKI_K
    23     824    20.9136   ARENAS_A             73    2334    10.4957   BRANDES_U
    24    3790    19.8478   LESKOVEC_J           74      51    10.4213   HOLLAND_P
    25     241    19.8113   SHI_J                75    3615    10.2568   KUMARA_S
    26   11535    19.7797   MALIK_J              76    3614    10.2302   RAGHAVAN_U
    27    3648    19.7317   ROSVALL_M            77    2623    10.2084   FIEDLER_M
    28    4203    19.2631   VONLUXBU_U           78     134    10.1968   GAREY_M
    29    4446    19.1634   BERGSTRO_C           79    3031    10.1012   DIAZ-GUI_A
    30    2998    19.1422   BARTHELE_M           80     143    10.0321   LEINHARD_S
    31    3979    18.6968   LEFEBVRE_E           81     137     9.9847   SCOTT_J
    32    3886    18.6552   GUILLAUM_J           82     274     9.8897   BAK_P
    33     164    18.6261   DOREIAN_P            83     272     9.4798   SCHOLZ_C
    34    2775    18.3258   KLEINBER_J           84    3032     9.4052   BAGROW_J
    35      40    18.1618   BREIGER_R            85    2104     9.3647   RAND_W
    36     844    17.4888   VICSEK_T             86    2627     9.2402   LUSSEAU_D
    37     218    17.4204   BORGATTI_S           87     941     9.2348   GOLDBERG_D
    38    3036    16.9268   PALLA_G              88    1750     9.1468   KERTESZ_J
    39     919    16.8126   OKADA_Y              89    1984     9.0590   RENYI_A
    40      39    16.7620   BOORMAN_S            90     560     9.0590   ERDOS_P
    41    2336    15.8376   CHUNG_F              91    3035     8.9879   LATAPY_M
    42    2625    15.8216   GUIMERA_R            92     989     8.8951   YANG_J
    43    2629    15.7187   RADICCHI_F           93      83     8.8018   WARD_J
    44     276    14.9995   CARLSON_J            94    4558     8.7767   PIZZUTI_C
    45     192    14.9914   EVERETT_M            95    2266     8.7543   JEONG_H
    46    2990    14.6212   DUCH_J               96     252     8.5407   LIN_S
    47    2829    14.5231   AMARAL_L             97    1056     8.5407   KERNIGHA_B
    48      47    14.4554   GRANOVET_M           98     429     8.5187   CHRISTEN_K
    49    3412    13.7216   DERENYI_I            99    6786     8.4475   OLTVAI_Z
    50    3259    13.7216   FARKAS_I            100      57     8.3670   MILGRAM_S
 ---------------------------------         ---------------------------------

July 22, 2018

http://vladowiki.fmf.uni-lj.si/doku.php?id=pro:bm2
https://github.com/bavla/biblio/tree/master/Pajek/macro
https://raw.githubusercontent.com/bavla/biblio/master/Pajek/macro/norm1.mcr
https://raw.githubusercontent.com/bavla/biblio/master/Pajek/macro/norm2.mcr
C:\Programi\Pajek\macro\biblio

read CiteC
play norm1 [5695]
convert to 2-mode
read WAc
play norm2 [5695]
transpose 2-mode
select n(CiteC) as Second
multiply
select n(WAc) as Second
multiply [yes]
remove loops
weighted indegrees

10. Deleted loops in N9 (13376)
==================================================
Lowest value of line:               0.00010254
Highest value of line:              2.51388889

> 1/2.52
[1] 0.3968254

nACiA weights are similarities, s ∈ [ 0, ∞ ). To convert them to distances d we can use different transformations, for example:

d = s_max / s - 1 ∈ [ 0, ∞ ]
d = 1 - s / s_max ∈ [ 0, 1 ]

We selected the second option.
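Both transformations are easy to check numerically. The sketch below applies them to the lowest and highest line values reported for the network after deleting loops; the function names are hypothetical.

```python
import math

def to_distance_ratio(s, s_max):
    """d = s_max / s - 1; s = s_max gives 0, s = 0 is mapped to infinity."""
    return math.inf if s == 0 else s_max / s - 1.0

def to_distance_linear(s, s_max):
    """d = 1 - s / s_max; s = s_max gives 0, s = 0 gives 1."""
    return 1.0 - s / s_max

s_min, s_max = 0.00010254, 2.51388889   # lowest and highest line value
d_lin = [to_distance_linear(s, s_max) for s in (s_max, s_min)]
d_rat = [to_distance_ratio(s, s_max) for s in (s_max, s_min)]
```

The linear version keeps all distances in a bounded range, while the ratio version stretches small similarities to very large distances.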

Network/Create hierarchy/Clustering RC/Run [maximum, leader]

C:\Users\batagelj\work\Python\WoS\BM\results\Acite

> wdir <- "C:/Users/batagelj/work/Python/WoS/BM/results/Acite"
> setwd(wdir)
> source("https://raw.githubusercontent.com/bavla/biblio/master/Pajek/R/readCluRC.R")
> RM <- readCluRC("MaxLeader.clu")   
> n <- RM$n; nm <- n-1; np <- n+1
> n
[1] 13376
> HM <- read.csv("MaxLeaderHeig.vec",header=FALSE,skip=np)[[1]]
> RM$height <- HM
> RM$method <- "Maximum/Tolerant"
> RM$dist.method <- "nACiA"
> class(RM) <- "hclust"
> RM$call <- "Pajek.data"
> size <- read.csv("MaxLeaderSize.vec",header=FALSE,skip=np)[[1]]
> RM$labels <- read.csv("nACIA.net",header=FALSE,skip=1,sep="",colClasses="character",nrows=n)$V2
> length(size)
[1] 13375

We determine the partition of units into clusters of size at most 50.

select size vector as First
Network/Create hierarchy/Make partition/with threshold determined by vector [50]
save partition as cut50.clu

Since Pajek has no option to keep only the clusters of size at least 20, we do this in R:

> clu <- read.csv("cut50.clu",header=FALSE,skip=1)[[1]]
> S <- table(clu)
> length(S)
[1] 258 
> table(S)
S
   2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17 
  45   39   25   13   20   12    6    8    5    4    5    1    3    4    4    1 
  18   19   20   21   22   23   24   25   26   27   29   31   32   33   35   38 
   3    1    2    1    1    3    1    2    3    1    3    2    1    1    2    2 
  39   40   44   45   46   47   48   50 9961 
   1    1    1    1    2    1    1   25    1 
> P <- integer(n)
> for(i in 1:n) if(S[clu[i]+1]>19) P[i] <- clu[i] else P[i] <- 0
> table(P)
P
    0     1     3     4     7     8     9    10    11    12    13    14    15 
11080    50    29    50    31    50    35    47    50    46    50    50    50 
   17    20    21    23    24    26    27    29    31    32    35    36    39 
   50    50    50    26    50    50    23    50    50    46    50    50    50 
   40    41    43    44    45    49    53    56    58    60    61    62    67 
   26    29    25    33    45    35    25    38    39    50    50    23    44 
   68    69    70    75    77    79    81    87    92    93    95    96   106 
   38    48    29    50    50    26    50    23    50    31    20    50    27 
  127   134   139   145   161   164   226 
   20    50    24    21    40    32    22 
> T <- table(P); length(T)
[1] 59
> out <- file("cut50-20.clu","w"); cat("*vertices 13376",P,sep="\n",file=out); close(out)
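For readers who prefer Python, the same filtering step can be sketched as follows. This is not the code that was run. It counts cluster sizes by label rather than by position in the table, and the function names and toy partition are hypothetical.

```python
from collections import Counter

def keep_large_clusters(clu, min_size=20):
    """Keep cluster numbers of clusters with at least min_size units, else 0."""
    sizes = Counter(clu)
    return [c if sizes[c] >= min_size else 0 for c in clu]

def write_clu(path, partition):
    """Write a partition in Pajek .clu format: header line, one value per line."""
    with open(path, "w") as out:
        out.write(f"*vertices {len(partition)}\n")
        out.writelines(f"{c}\n" for c in partition)

# Toy partition (assumption): cluster 0 collects the units outside all clusters
clu = [0] * 5 + [1] * 3 + [2] * 2 + [3]
reduced = keep_large_clusters(clu, min_size=3)
n_clusters = len(set(reduced) - {0})
```

As in the R session, units of the dropped clusters are moved to cluster 0, so the number of remaining clusters is one less than the number of distinct values in the reduced partition.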

The reduced partition is saved in the file cut50-20.clu and read into Pajek. It still contains 58 clusters. We extracted the corresponding subnetworks of citations among authors for visual inspection, cut50-20.net. Most of them are (double) star-like, formed around prominent scientists: Albert R + Barabasi A, Bergstro C + Rosvall M, Bezdek J, Blei D, Blondel V, Bonacich P + Kleinberg J, Breiger R, Burt R + Doreian P, Chung F + Von Luxbu U, Clauset A, Dieteric J + Meade B, Fortunato S, Freeman L, Ghosh J, Girvan M, Goldberg D, Jaccard P, Jain A, Johnson D, Jordan M, Kaufman L, Knuth D, Leskovec J, MacQueen J, Newman M, Newmark N, Okada Y, Palla G + Vicsek T, Prescott W, Schaeffe S, Scott J, Sporus O, Stein C, Strehl A, Strogatz S, Van Donge S. There are also some “cliques” of co-authors with attachments.

We visually selected 12 clusters (Adamic L, Batagelj V + Ferligoj A, Bollobas B, Burt R + Doreian P, Faust K + Watts D, Fiedler M + Harary F, Granovet M, Mizruchi M, Murtagh F, Nowicki K + Wasserman S, Robins G, Ward J, White H + Zachary W) with a more interesting network structure for detailed inspection, cut50select.net. The separate subnetworks are saved as cut50_xy.net.

Most of the cluster subnetworks obtained with the Leader strategy have an almost acyclic structure, which should also be taken into account in their visualization. We identified some interesting clusters: {21,8,15,13,1,20,10,39,70,43,23,14}. Because of limited space we present here only the subnetworks induced by four of the selected clusters.
