Important bibliographic subnetworks

March 2022

Construction of important bibliographic subnetworks from a collection of bibliographic networks obtained from WoS using WoS2Pajek.

SocNet 2018 networks

For illustration, we shall use the collection on network analysis (2018): paper, files.

Network  # Nodes (sum)   # Mode 1  # Mode 2     # Arcs
CiteN        1,297,133                       2,753,633
CiteR           70,792                         398,199
WAn          1,693,104  1,297,133   395,971  1,442,240
WAr            163,803     70,792    93,011    215,901
WKn          1,329,542  1,297,133    32,409  1,167,666
WKr            103,201     70,792    32,409  1,167,666
WJn          1,366,279  1,297,133    69,146    720,044
WJr             79,735     70,792     8,943     61,741

Co - First Co-authorship network

Co = WAr^T * WAr

Co[a,b] = # works that authors a and b co-authored

Co[a,a] = # works of author a

Problem: Co is a sum of complete subgraphs spanned on the co-authors of each work. Works with a large number of co-authors blur the picture.
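As a minimal sketch of the product of the transposed WAr with WAr, the following pure-Python code counts co-authored works on a toy example. The works w1..w3 and authors a, b, c are hypothetical and not taken from the SocNet data; the function name coauthorship is ours.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Hypothetical toy 2-mode network: work -> list of its authors.
works = {
    "w1": ["a", "b"],
    "w2": ["a", "b", "c"],
    "w3": ["c"],
}

def coauthorship(works):
    """Return Co as a dict {(a, b): count} with a <= b.

    Co[a,b] = number of works that a and b co-authored,
    Co[a,a] = number of works of author a.
    """
    co = Counter()
    for authors in works.values():
        # each work adds a complete subgraph (with loops) on its authors
        for a, b in combinations_with_replacement(sorted(set(authors)), 2):
            co[(a, b)] += 1
    return co

co = coauthorship(works)
print(co[("a", "b")], co[("a", "a")], co[("c", "c")])  # 2 2 2
```

A work with k authors adds k(k-1)/2 author pairs, which is why a few works with very many co-authors can dominate Co.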

read WAr
info
  2-Mode Network: Rows=70792, Cols=93011
Network/Create vector/centrality/degree/input
Info vector [+200]
  chinese 
Network/2-mode/transpose
select WAr as second
Networks/multiply [yes]
Network/create new/transform/remove/loops [yes]
Network/create new/transform/arcs->edges/bidirected/min [no]
File/Network/change label [Co]
Network/Info/Line values
         Line Values            Frequency       Freq%      CumFreq  CumFreq%
----------------------------------------------------------------------------
 (            ...    1.0000]       251440     86.1772       251440   86.1772
 (     1.0000 ...    2.0000]        27072      9.2785       278512   95.4557
 (     2.0000 ...    3.0000]         7252      2.4855       285764   97.9412
 (     3.0000 ...    4.0000]         2895      0.9922       288659   98.9334
 (     4.0000 ...    5.0000]         1350      0.4627       290009   99.3961
 (     5.0000 ...    6.0000]          720      0.2468       290729   99.6429
 (     6.0000 ...    7.0000]          329      0.1128       291058   99.7556
 (     7.0000 ...    8.0000]          225      0.0771       291283   99.8327
 (     8.0000 ...    9.0000]          143      0.0490       291426   99.8818
 (     9.0000 ...   10.0000]           86      0.0295       291512   99.9112
 (    10.0000 ...   11.0000]           58      0.0199       291570   99.9311
 (    11.0000 ...   12.0000]           57      0.0195       291627   99.9506
 (    12.0000 ...   13.0000]           32      0.0110       291659   99.9616
 (    13.0000 ...   14.0000]           23      0.0079       291682   99.9695
 (    14.0000 ...   15.0000]           17      0.0058       291699   99.9753
 (    15.0000 ...   16.0000]            9      0.0031       291708   99.9784
 (    16.0000 ...   17.0000]            9      0.0031       291717   99.9815
 (    17.0000 ...   18.0000]            5      0.0017       291722   99.9832
 (    18.0000 ...   19.0000]            8      0.0027       291730   99.9859
 (    19.0000 ...   20.0000]            4      0.0014       291734   99.9873
 (    20.0000 ...   21.0000]            1      0.0003       291735   99.9877
 (    21.0000 ...   22.0000]           13      0.0045       291748   99.9921
 (    22.0000 ...   23.0000]            4      0.0014       291752   99.9935
 (    23.0000 ...   24.0000]            2      0.0007       291754   99.9942
 (    24.0000 ...   25.0000]            3      0.0010       291757   99.9952
 (    25.0000 ...   26.0000]            1      0.0003       291758   99.9955
 (    26.0000 ...   27.0000]            3      0.0010       291761   99.9966
 (    27.0000 ...   28.0000]            1      0.0003       291762   99.9969
 (    28.0000 ...   29.0000]            1      0.0003       291763   99.9973
 (    29.0000 ...   30.0000]            1      0.0003       291764   99.9976
 (    30.0000 ...   31.0000]            3      0.0010       291767   99.9986
 (    31.0000 ...   32.0000]            1      0.0003       291768   99.9990
 (    32.0000 ...   33.0000]            1      0.0003       291769   99.9993
 (    33.0000 ...   34.0000]            0      0.0000       291769   99.9993
 (    34.0000 ...   35.0000]            0      0.0000       291769   99.9993
 (    35.0000 ...   36.0000]            0      0.0000       291769   99.9993
 (    36.0000 ...   37.0000]            0      0.0000       291769   99.9993
 (    37.0000 ...   38.0000]            1      0.0003       291770   99.9997
 (    38.0000 ...   39.0000]            0      0.0000       291770   99.9997
 (    39.0000 ...   40.0000]            0      0.0000       291770   99.9997
 (    40.0000 ...   41.0000]            0      0.0000       291770   99.9997
 (    41.0000 ...   42.0000]            0      0.0000       291770   99.9997
 (    42.0000 ...   43.0000]            1      0.0003       291771  100.0000
----------------------------------------------------------------------------

The most active co-authors.

% edge cut at 10
Network/create new/transform/remove/lines with value/lower than [10][yes]
Network/create partition/degree/all
File/partition/change label [cut 10]
Operations/network+partition/extract/subnetwork [1-*]
Draw/network+first partition

Mostly small components. To get more structure we must lower the threshold.
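The edge cut used above (remove lines with value lower than a threshold, then keep only the nodes of degree at least 1) can be sketched in Python as follows. The dictionary representation of the network and the name edge_cut are assumptions made for illustration; the line values are hypothetical.

```python
def edge_cut(lines, threshold):
    """Keep the lines with value >= threshold.

    lines: dict {(u, v): value} describing an undirected weighted network.
    Returns the kept lines and the set of nodes that remain non-isolated.
    """
    kept = {e: v for e, v in lines.items() if v >= threshold}
    nodes = {u for e in kept for u in e}
    return kept, nodes

# Hypothetical co-authorship counts.
co = {("a", "b"): 12, ("a", "c"): 3, ("c", "d"): 10, ("d", "e"): 1}
kept, nodes = edge_cut(co, 10)
print(sorted(kept), sorted(nodes))
```

Lowering the threshold keeps more lines, so the small components can merge into larger ones.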

Select Co
Network/create new/transform/remove/lines with value/lower than [5][yes]
Network/create partition/components/weak [5]
Partition/canonical/decreasing
Info partition
Operations/network+partition/extract/subnetwork [1-*]
Draw/network+first partition 
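The component step can be sketched in plain Python: weak components of the cut network, restricted to a minimum size and ordered by decreasing size. This is only an illustration of the idea under our own data representation (a list of node pairs); the function name weak_components is hypothetical and the code does not reproduce Pajek's numbering of clusters.

```python
from collections import defaultdict

def weak_components(edges, min_size=1):
    """Return the weak components (as sets of nodes) with at least
    min_size nodes, largest first. Line directions are ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        # depth-first search from an unvisited node
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        if len(comp) >= min_size:
            comps.append(comp)
    return sorted(comps, key=len, reverse=True)

# Hypothetical lines that survived the cut.
edges = [(1, 2), (2, 3), (4, 5)]
print(weak_components(edges, min_size=2))
```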

Large (661 nodes) component of mainly Chinese authors. We exclude them.

Select canonical partition
Partition/binarize [2-*]
Select partition cut 10 as second
Partitions max
Select Co
Operations/network+partition/extract/subnetwork [1-*]
Network/create partition/components/weak [1]
Draw/network+first partition

Large component with the main authors.

Another option is to make a list (partition) of “interesting” authors (for example SNA) and extract the corresponding subnetwork from Co.

Ct' - Newman's strict co-authorship

The works with many co-authors are overrepresented (contribute more to the total weight) in the network Co. To make contributions equal we use a fractional approach - normalization.

Normalization of a (2-mode) binary network N, n[w,a] ∈ {0,1} (macros):

n(N)[w,a] = n[w,a]/max(1,outdeg(w))

n'(N)[w,a] = n[w,a]/max(1,outdeg(w)-1)

Ct' = n(WAr)^T * n'(WAr)
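To see what the two normalizations do, here is a minimal Python sketch of Ct' on a toy example. The works and authors are hypothetical, the function name strict_coauthorship is ours, and we assume that the loops are removed and the two opposite arcs are summed into one edge, as in the Pajek steps of this section.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations

# Hypothetical toy 2-mode network: work -> list of its authors.
works = {
    "w1": ["a", "b"],
    "w2": ["a", "b", "c"],
    "w3": ["c"],
}

def strict_coauthorship(works):
    """Return Ct' as a dict {(a, b): weight} with a < b (loops removed)."""
    ct = defaultdict(Fraction)
    for authors in works.values():
        authors = sorted(set(authors))
        k = len(authors)              # outdeg(w)
        if k < 2:
            continue                  # single-author works produce only loops
        # n[w,a] * n'[w,b] = 1/k * 1/(k-1) for every ordered pair a != b
        w = Fraction(1, k) * Fraction(1, max(1, k - 1))
        for a, b in combinations(authors, 2):
            ct[(a, b)] += 2 * w       # sum of the arcs a->b and b->a
    return dict(ct)

ct = strict_coauthorship(works)
print(ct[("a", "b")], sum(ct.values()))  # 4/3 2
```

In this sketch every work with at least two authors contributes a total weight of exactly 1, regardless of the number of its co-authors.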

Select WAr
Macro/play/ norm2p [70792] -> n'(WAr)
Select WAr
Run macro norm2 [70792]
Network/2-mode/transpose
Select n'(WAr) as second
Networks/Multiply [yes]
Network/create new/transform/remove/loops [yes]
Network/create new/transform/arcs->edges/bidirected/sum [no]
File/Network/change label [Ct']
Network/Info/Line values [#25]
         Line Values           Frequency       Freq%      CumFreq  CumFreq%
---------------------------------------------------------------------------
 (            ...   0.0001]         7861      2.6942         7861    2.6942
 (     0.0001 ...   1.0348]       278858     95.5743       286719   98.2685
 (     1.0348 ...   2.0696]         3943      1.3514       290662   99.6199
 (     2.0696 ...   3.1043]          751      0.2574       291413   99.8773
 (     3.1043 ...   4.1390]          216      0.0740       291629   99.9513
 (     4.1390 ...   5.1737]           58      0.0199       291687   99.9712
 (     5.1737 ...   6.2084]           44      0.0151       291731   99.9863
 (     6.2084 ...   7.2432]           23      0.0079       291754   99.9942
 (     7.2432 ...   8.2779]            7      0.0024       291761   99.9966
 (     8.2779 ...   9.3126]            3      0.0010       291764   99.9976
 (     9.3126 ...  10.3473]            3      0.0010       291767   99.9986
 (    10.3473 ...  11.3820]            1      0.0003       291768   99.9990
 (    11.3820 ...  12.4167]            0      0.0000       291768   99.9990
 (    12.4167 ...  13.4515]            0      0.0000       291768   99.9990
 (    13.4515 ...  14.4862]            0      0.0000       291768   99.9990
 (    14.4862 ...  15.5209]            1      0.0003       291769   99.9993
 (    15.5209 ...  16.5556]            0      0.0000       291769   99.9993
 (    16.5556 ...  17.5903]            1      0.0003       291770   99.9997
 (    17.5903 ...  18.6250]            0      0.0000       291770   99.9997
 (    18.6250 ...  19.6597]            0      0.0000       291770   99.9997
 (    19.6597 ...  20.6945]            0      0.0000       291770   99.9997
 (    20.6945 ...  21.7292]            0      0.0000       291770   99.9997
 (    21.7292 ...  22.7639]            0      0.0000       291770   99.9997
 (    22.7639 ...  23.7986]            0      0.0000       291770   99.9997
 (    23.7986 ...  24.8333]            1      0.0003       291771  100.0000
---------------------------------------------------------------------------

The threshold is now much lower.

Ct' edge cut 4 
Ct' edge cut 2
cut 2 components >= 5
union
Ct' extract union 
Partition weak
Draw

Partition for extraction of selected components.

select union
Partition binarize [1-*]
Operations/network+partition/transform/remove lines/between clusters
Operations/network+partition/transform/remove lines/between two clusters [0][0]
Network/create partition/components/weak [2]
Partition/canonical/decreasing  !!! for extracting selected components
Info partition
Select Ct'
Operations/network+partition/extract/subnetwork [1]
Draw

Partition for extraction of the main component.

Select canonical
Partition binarize [1]
Info
save partition to Main.clu

Jaccard weights using Pajek

April 6, 2022

select Co (with loops)
Network/Create vector/Get loops
Operations/Network+vector/Transform/Vector -> Line value/initial
File/Network/Change label [Co[e,e]]
select Co (with loops)
Operations/Network+vector/Transform/Vector -> Line value/terminal
File/Network/Change label [Co[f,f]]
Select Co[e,e] as Second
Networks/Cross intersection/add
select Co (with loops) as Second
Networks/Cross intersection/subtract
Select Last as Second
select Co (with loops) as First
Networks/Cross intersection/divide
Network/Create new network/Transform/Remove/loops
Network/Create new network/Transform/Arcs -> edges/bidirected/min
File/Network/Change label [Jaccard]
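The steps above compute, for each line (e,f), the value Co[e,f] / (Co[e,e] + Co[f,f] - Co[e,f]). A minimal Python sketch of the same formula, assuming Co is given as a dictionary with loops and with integer counts (the values below are hypothetical, the function name jaccard is ours):

```python
from fractions import Fraction

def jaccard(co):
    """co: dict {(e, f): count} with e <= f, including the loops (e, e).

    Returns {(e, f): Co[e,f] / (Co[e,e] + Co[f,f] - Co[e,f])} for e != f.
    """
    return {
        (e, f): Fraction(v, co[(e, e)] + co[(f, f)] - v)
        for (e, f), v in co.items()
        if e != f
    }

# Hypothetical co-authorship counts with loops.
co = {
    ("a", "a"): 2, ("b", "b"): 2, ("c", "c"): 2,
    ("a", "b"): 2, ("a", "c"): 1, ("b", "c"): 1,
}
J = jaccard(co)
print(J[("a", "b")], J[("a", "c")])  # 1 1/3
```

Authors a and b wrote all their works together, so their Jaccard weight is 1.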

Temporal co-authorship networks

paper, SetUp

# directories: Nets library (gdir), working directory (wdir), network data (ndir), charts (cdir)
gdir = 'C:/Users/vlado/work/Python/graph/Nets'
wdir = 'C:/Users/vlado/work/Python/WoS/SocNet/2022'
ndir = 'C:/Users/vlado/work/Python/WoS/SocNet/2022'
cdir = 'C:/Users/vlado/work/Python/graph/Nets/chart'
import sys, os, datetime, json
# make the Nets library importable and switch to the working directory
sys.path = [gdir]+sys.path; os.chdir(wdir)
from TQ import *
from Nets import Network as N
# input: two-mode network WAr and the year partition YearR.clu
net = ndir+"/WAr.net"
clu = ndir+"/YearR.clu"
t1 = datetime.datetime.now(); print("started: ",t1.ctime(),"\n")
# cumulative temporal network (instant=False)
WAc = N.twoMode2netsJSON(clu,net,"WAcum.json",instant=False)
t2 = datetime.datetime.now()
print("\nconverted to cumulative TN: ",t2.ctime(),"\ntime used: ", t2-t1)
# instantaneous temporal network (instant=True)
WAi = N.twoMode2netsJSON(clu,net,"WAins.json",instant=True)
t3 = datetime.datetime.now()
print("\nconverted to instantaneous TN: ",t3.ctime(),"\ntime used: ", t3-t2)
ia = WAi.Index()


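# temporal co-authorship network obtained from the instantaneous network WAi
Co = WAi.TQtwo2oneCols()
Co.saveNetsJSON("CoIns.json",indent=2)
Co.delLoops()
# list of the top links for threshold 15
C = Co.TQtopLinks(thresh=15)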

len(C)
C[0]
C[1]
C[2]

tit = C[0][2]+" - "+C[0][3]; bd = C[0][5]
TQmax = 15; Tmin = 2000; Tmax = 2017; w = 600; h = 150
N.TQshow(bd,cdir,TQmax,Tmin,Tmax,w,h,tit,fill="red")

tit = C[2][2]+" - "+C[2][3]; ra = C[2][5]
TQmax = 10; Tmin = 1996; Tmax = 2017; w = 600; h = 150
N.TQshow(ra,cdir,TQmax,Tmin,Tmax,w,h,tit,fill="blue")

TQ.total(bd), TQ.total(ra)
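The following plain-Python sketch illustrates one plausible reading of the difference between instantaneous and cumulative yearly co-authorship counts for a pair of authors. It does not use the Nets or TQ libraries and makes no claim about their internal representation; the data and the function name instantaneous are hypothetical.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical works: (publication year, list of authors).
works = [
    (2000, ["a", "b"]),
    (2001, ["a", "b", "c"]),
    (2001, ["a", "b"]),
    (2003, ["a", "c"]),
]

def instantaneous(works, pair, tmin, tmax):
    """Number of works co-authored by the pair in each year tmin..tmax."""
    counts = Counter()
    for year, authors in works:
        if set(pair) <= set(authors):
            counts[year] += 1
    return [counts[y] for y in range(tmin, tmax + 1)]

ins = instantaneous(works, ("a", "b"), 2000, 2003)
cum = list(accumulate(ins))   # running total over the years
print(ins, cum)  # [1, 2, 0, 0] [1, 3, 3, 3]
```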

We can also compute fractional temporal co-authorship networks; see bavla.

To do

AKr = WAr^T * WKr

tf-idf weights for Main : All. Select the 200 most important keywords as Keys. Extract from AKr the subnetwork on Main × Keys.

AJr = WAr^T * WJr

Instead of tf-idf weights we can use the fractional approach.



Add to Nets the procedure that converts a temporal network into a sequence of temporal slices (in Pajek format).



vlado/work/lnk/bib.txt · Last modified: 2022/04/06 05:22 by vlado
 