March 2022
Construction of important bibliographic subnetworks from a collection of bibliographic networks obtained from WoS using WoS2Pajek.
For illustration, we shall use the collection on network analysis (2018): paper, files.
Network | # Nodes (sum) | # Mode 1 | # Mode 2 | # Arcs |
---|---|---|---|---|
CiteN | 1,297,133 | | | 2,753,633 |
CiteR | 70,792 | | | 398,199 |
WAn | 1,693,104 | 1,297,133 | 395,971 | 1,442,240 |
WAr | 163,803 | 70,792 | 93,011 | 215,901 |
WKn | 1,329,542 | 1,297,133 | 32,409 | 1,167,666 |
WKr | 103,201 | 70,792 | 32,409 | 1,167,666 |
WJn | 1,366,279 | 1,297,133 | 69,146 | 720,044 |
WJr | 79,735 | 70,792 | 8,943 | 61,741 |
Co = WAr^T * WAr
Co[a,b] = # works that authors a and b co-authored
Co[a,a] = # works of author a
Problem: Co is a sum of complete subgraphs, one spanned on the co-authors of each work. Works with a large number of co-authors blur the picture.
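A minimal sketch of the product Co = WAr^T * WAr on a toy example, assuming the authorship relation is given as a dictionary from works to author lists. The names `works` and `coauthorship` and the data are hypothetical and not part of the collection.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Hypothetical toy authorship data: work -> list of its authors.
works = {
    "w1": ["a", "b"],
    "w2": ["a", "b", "c"],
    "w3": ["c"],
}

def coauthorship(works):
    """Return Co as a Counter keyed by sorted author pairs (a, b) with a <= b."""
    co = Counter()
    for authors in works.values():
        # Each work adds a complete subgraph (with loops) on its authors.
        for a, b in combinations_with_replacement(sorted(set(authors)), 2):
            co[(a, b)] += 1
    return co

co = coauthorship(works)
print(co[("a", "b")])  # number of works co-authored by a and b: 2
print(co[("a", "a")])  # number of works of author a: 2
```

A work with k co-authors adds k(k-1)/2 edges, so a single work with 100 authors would add 4,950 edges. This is the blurring effect described above.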
read WAr
info
2-Mode Network: Rows=70792, Cols=93011
Network/Create vector/centrality/degree/input
Info vector [+200] chinese
Network/2-mode/transpose
select WAr as second
Networks/multiply [yes]
Network/create new/transform/remove/loops [yes]
Network/create new/transform/arcs->edges/bidirected/min [no]
File/Network/change label [Co]
Network/Info/Line values
Line Values | Frequency | Freq% | CumFreq | CumFreq% |
---|---|---|---|---|
( ... 1.0000] | 251440 | 86.1772 | 251440 | 86.1772 |
( 1.0000 ... 2.0000] | 27072 | 9.2785 | 278512 | 95.4557 |
( 2.0000 ... 3.0000] | 7252 | 2.4855 | 285764 | 97.9412 |
( 3.0000 ... 4.0000] | 2895 | 0.9922 | 288659 | 98.9334 |
( 4.0000 ... 5.0000] | 1350 | 0.4627 | 290009 | 99.3961 |
( 5.0000 ... 6.0000] | 720 | 0.2468 | 290729 | 99.6429 |
( 6.0000 ... 7.0000] | 329 | 0.1128 | 291058 | 99.7556 |
( 7.0000 ... 8.0000] | 225 | 0.0771 | 291283 | 99.8327 |
( 8.0000 ... 9.0000] | 143 | 0.0490 | 291426 | 99.8818 |
( 9.0000 ... 10.0000] | 86 | 0.0295 | 291512 | 99.9112 |
( 10.0000 ... 11.0000] | 58 | 0.0199 | 291570 | 99.9311 |
( 11.0000 ... 12.0000] | 57 | 0.0195 | 291627 | 99.9506 |
( 12.0000 ... 13.0000] | 32 | 0.0110 | 291659 | 99.9616 |
( 13.0000 ... 14.0000] | 23 | 0.0079 | 291682 | 99.9695 |
( 14.0000 ... 15.0000] | 17 | 0.0058 | 291699 | 99.9753 |
( 15.0000 ... 16.0000] | 9 | 0.0031 | 291708 | 99.9784 |
( 16.0000 ... 17.0000] | 9 | 0.0031 | 291717 | 99.9815 |
( 17.0000 ... 18.0000] | 5 | 0.0017 | 291722 | 99.9832 |
( 18.0000 ... 19.0000] | 8 | 0.0027 | 291730 | 99.9859 |
( 19.0000 ... 20.0000] | 4 | 0.0014 | 291734 | 99.9873 |
( 20.0000 ... 21.0000] | 1 | 0.0003 | 291735 | 99.9877 |
( 21.0000 ... 22.0000] | 13 | 0.0045 | 291748 | 99.9921 |
( 22.0000 ... 23.0000] | 4 | 0.0014 | 291752 | 99.9935 |
( 23.0000 ... 24.0000] | 2 | 0.0007 | 291754 | 99.9942 |
( 24.0000 ... 25.0000] | 3 | 0.0010 | 291757 | 99.9952 |
( 25.0000 ... 26.0000] | 1 | 0.0003 | 291758 | 99.9955 |
( 26.0000 ... 27.0000] | 3 | 0.0010 | 291761 | 99.9966 |
( 27.0000 ... 28.0000] | 1 | 0.0003 | 291762 | 99.9969 |
( 28.0000 ... 29.0000] | 1 | 0.0003 | 291763 | 99.9973 |
( 29.0000 ... 30.0000] | 1 | 0.0003 | 291764 | 99.9976 |
( 30.0000 ... 31.0000] | 3 | 0.0010 | 291767 | 99.9986 |
( 31.0000 ... 32.0000] | 1 | 0.0003 | 291768 | 99.9990 |
( 32.0000 ... 33.0000] | 1 | 0.0003 | 291769 | 99.9993 |
( 33.0000 ... 34.0000] | 0 | 0.0000 | 291769 | 99.9993 |
( 34.0000 ... 35.0000] | 0 | 0.0000 | 291769 | 99.9993 |
( 35.0000 ... 36.0000] | 0 | 0.0000 | 291769 | 99.9993 |
( 36.0000 ... 37.0000] | 0 | 0.0000 | 291769 | 99.9993 |
( 37.0000 ... 38.0000] | 1 | 0.0003 | 291770 | 99.9997 |
( 38.0000 ... 39.0000] | 0 | 0.0000 | 291770 | 99.9997 |
( 39.0000 ... 40.0000] | 0 | 0.0000 | 291770 | 99.9997 |
( 40.0000 ... 41.0000] | 0 | 0.0000 | 291770 | 99.9997 |
( 41.0000 ... 42.0000] | 0 | 0.0000 | 291770 | 99.9997 |
( 42.0000 ... 43.0000] | 1 | 0.0003 | 291771 | 100.0000 |
The most active co-authors.
% edge cut at 10
Network/create new/transform/remove/lines with value/lower than [10][yes]
Network/create partition/degree/all
File/partition/change label [cut 10]
Operations/network+partition/extract/subnetwork [1-*]
Draw/network+first partition
Mostly small components. To get more structure we must lower the threshold.
Select Co
Network/create new/transform/remove/lines with value/lower than [5][yes]
Network/create partition/components/weak [5]
Partition/canonical/decreasing
Info partition
Operations/network+partition/extract/subnetwork [1-*]
Draw/network+first partition
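The line cut followed by weak components can be sketched outside Pajek as follows. This is only an illustration on hypothetical toy data. The helper names `line_cut` and `weak_components` are assumptions, and nodes left isolated by the cut are simply dropped, which corresponds to extracting clusters [1-*] after the cut.

```python
from collections import defaultdict

# Hypothetical weighted co-authorship edges: (author, author) -> weight.
edges = {("a", "b"): 6, ("b", "c"): 5, ("c", "d"): 2, ("d", "e"): 7}

def line_cut(edges, threshold):
    """Keep only the lines with value >= threshold (remove the lower ones)."""
    return {e: w for e, w in edges.items() if w >= threshold}

def weak_components(edges, min_size=1):
    """Return components of size >= min_size, ordered by decreasing size."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:  # depth-first traversal of one component
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        if len(comp) >= min_size:
            comps.append(comp)
    return sorted(comps, key=len, reverse=True)

comps = weak_components(line_cut(edges, 5))
print(comps)
```

Lowering the threshold keeps more lines, so components merge and more structure becomes visible; raising `min_size` hides the small components.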
Large (661 nodes) component of mainly Chinese authors. We exclude them.
Select canonical partition
Partition/binarize [2-*]
Select partition cut 10 as second
Partitions max
Select Co
Operations/network+partition/extract/subnetwork [1-*]
Network/create partition/components/weak [1]
Draw/network+first partition
Large component with the main authors.
Another option is to make a list (partition) of “interesting” authors (for example SNA) and extract the corresponding subnetwork from Co.
The works with many co-authors are overrepresented (contribute more to the total weight) in the network Co. To make contributions equal we use a fractional approach - normalization.
Normalization of (2-mode) binary network N, n[w,a] ∈ {0,1} macros
n(N)[w,a] = n[w,a]/max(1,outdeg(w))
n'(N)[w,a] = n[w,a]/max(1,outdeg(w)-1)
Ct' = n(WAr)^T * n'(WAr)
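The two normalizations and the product Ct' can be illustrated on a toy example. This is a sketch, assuming that loops are removed and that the arcs a->b and b->a are summed into one edge, as in the Pajek sequence (bidirected/sum). The function name `fractional_coauthorship` and the data are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy authorship data: work -> list of its authors.
works = {"w1": ["a", "b"], "w2": ["a", "b", "c"], "w3": ["c"]}

def fractional_coauthorship(works):
    """Edge weights of Ct' keyed by sorted author pairs (a, b), a < b."""
    ct = defaultdict(Fraction)
    for authors in works.values():
        authors = sorted(set(authors))
        k = len(authors)                  # outdeg(w)
        n = Fraction(1, max(1, k))        # n(N)[w,a]
        n1 = Fraction(1, max(1, k - 1))   # n'(N)[w,a]
        for i, a in enumerate(authors):
            for b in authors[i + 1:]:
                # Arcs a->b and b->a each carry n*n1; summing gives the edge.
                ct[(a, b)] += 2 * n * n1
    return dict(ct)

ct = fractional_coauthorship(works)
print(ct[("a", "b")])    # 1 (from w1) + 1/3 (from w2) = 4/3
print(sum(ct.values()))  # each multi-author work contributes exactly 1
```

With this choice every work with at least two authors contributes a total weight of 1 to the network, regardless of the number of its co-authors.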
Select WAr
Macro/play/ norm2p [70792] -> n'(WAr)
Select WAr
Run macro norm2 [70792]
Network/2-mode/transpose
Select n'(WAr) as second
Networks/Multiply [yes]
Network/create new/transform/remove/loops [yes]
Network/create new/transform/arcs->edges/bidirected/sum [no]
File/Network/change label [Ct']
Network/Info/Line values [#25]
Line Values | Frequency | Freq% | CumFreq | CumFreq% |
---|---|---|---|---|
( ... 0.0001] | 7861 | 2.6942 | 7861 | 2.6942 |
( 0.0001 ... 1.0348] | 278858 | 95.5743 | 286719 | 98.2685 |
( 1.0348 ... 2.0696] | 3943 | 1.3514 | 290662 | 99.6199 |
( 2.0696 ... 3.1043] | 751 | 0.2574 | 291413 | 99.8773 |
( 3.1043 ... 4.1390] | 216 | 0.0740 | 291629 | 99.9513 |
( 4.1390 ... 5.1737] | 58 | 0.0199 | 291687 | 99.9712 |
( 5.1737 ... 6.2084] | 44 | 0.0151 | 291731 | 99.9863 |
( 6.2084 ... 7.2432] | 23 | 0.0079 | 291754 | 99.9942 |
( 7.2432 ... 8.2779] | 7 | 0.0024 | 291761 | 99.9966 |
( 8.2779 ... 9.3126] | 3 | 0.0010 | 291764 | 99.9976 |
( 9.3126 ... 10.3473] | 3 | 0.0010 | 291767 | 99.9986 |
( 10.3473 ... 11.3820] | 1 | 0.0003 | 291768 | 99.9990 |
( 11.3820 ... 12.4167] | 0 | 0.0000 | 291768 | 99.9990 |
( 12.4167 ... 13.4515] | 0 | 0.0000 | 291768 | 99.9990 |
( 13.4515 ... 14.4862] | 0 | 0.0000 | 291768 | 99.9990 |
( 14.4862 ... 15.5209] | 1 | 0.0003 | 291769 | 99.9993 |
( 15.5209 ... 16.5556] | 0 | 0.0000 | 291769 | 99.9993 |
( 16.5556 ... 17.5903] | 1 | 0.0003 | 291770 | 99.9997 |
( 17.5903 ... 18.6250] | 0 | 0.0000 | 291770 | 99.9997 |
( 18.6250 ... 19.6597] | 0 | 0.0000 | 291770 | 99.9997 |
( 19.6597 ... 20.6945] | 0 | 0.0000 | 291770 | 99.9997 |
( 20.6945 ... 21.7292] | 0 | 0.0000 | 291770 | 99.9997 |
( 21.7292 ... 22.7639] | 0 | 0.0000 | 291770 | 99.9997 |
( 22.7639 ... 23.7986] | 0 | 0.0000 | 291770 | 99.9997 |
( 23.7986 ... 24.8333] | 1 | 0.0003 | 291771 | 100.0000 |
Because the fractional weights are much smaller than the raw counts in Co, the cut threshold is now much lower.
Ct' edge cut 4
Ct' edge cut 2
cut 2 components >= 5
union
Ct' extract union
Partition weak
Draw
Partition for extraction of selected components.
select union
Partition binarize [1-*]
Operations/network+partition/transform/remove lines/between clusters
Operations/network+partition/transform/remove lines/between two clusters [0][0]
Network/create partition/components/weak [2]
Partition/canonical/decreasing !!! for extracting selected components
Info partition
Select Ct'
Operations/network+partition/extract/subnetwork [1]
Draw
Partition for extraction of the main component.
Select canonical
Partition binarize [1]
Info
save partition to Main.clu
April 6, 2022
select Co (with loops)
Network/Create vector/Get loops
Operations/Network+vector/Transform/Vector -> Line value/initial
File/Network/Change label [Co[e,e]]
select Co (with loops)
Operations/Network+vector/Transform/Vector -> Line value/terminal
File/Network/Change label [Co[f,f]]
Select Co[e,e] as Second
Networks/Cross intersection/add
select Co (with loops) as Second
Networks/Cross intersection/subtract
Select Last as Second
select Co (with loops) as First
Networks/Cross intersection/divide
Network/Create new network/Transform/Remove/loops
Network/Create new network/Transform/Arcs -> edges/bidirected/min
File/Network/Change label [Jaccard]
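The sequence above computes, for each pair of authors e and f, the value Co[e,f] / (Co[e,e] + Co[f,f] - Co[e,f]). A minimal sketch of the same computation on a hypothetical toy Co with loops follows; the dictionary layout and the function name `jaccard` are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical Co with loops: Co[a,a] = # works of a, Co[a,b] = # joint works.
co = {
    ("a", "a"): 2, ("b", "b"): 2, ("c", "c"): 2,
    ("a", "b"): 2, ("a", "c"): 1, ("b", "c"): 1,
}

def jaccard(co):
    """Jaccard[e,f] = Co[e,f] / (Co[e,e] + Co[f,f] - Co[e,f]) for e != f."""
    jac = {}
    for (e, f), w in co.items():
        if e == f:
            continue  # loops are removed in the final network
        jac[(e, f)] = Fraction(w, co[(e, e)] + co[(f, f)] - w)
    return jac

jac = jaccard(co)
print(jac[("a", "b")])  # a and b wrote all their works together: 1
print(jac[("a", "c")])  # one joint work out of three distinct works: 1/3
```

The value is 1 when two authors wrote all their works together and approaches 0 when joint works are a small share of their combined output.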
gdir = 'C:/Users/vlado/work/Python/graph/Nets'
wdir = 'C:/Users/vlado/work/Python/WoS/SocNet/2022'
ndir = 'C:/Users/vlado/work/Python/WoS/SocNet/2022'
cdir = 'C:/Users/vlado/work/Python/graph/Nets/chart'
import sys, os, datetime, json
sys.path = [gdir]+sys.path; os.chdir(wdir)
from TQ import *
from Nets import Network as N
net = ndir+"/WAr.net"
clu = ndir+"/YearR.clu"
t1 = datetime.datetime.now(); print("started: ",t1.ctime(),"\n")
WAc = N.twoMode2netsJSON(clu,net,"WAcum.json",instant=False)
t2 = datetime.datetime.now()
print("\nconverted to cumulative TN: ",t2.ctime(),"\ntime used: ", t2-t1)
WAi = N.twoMode2netsJSON(clu,net,"WAins.json",instant=True)
t3 = datetime.datetime.now()
print("\nconverted to instantaneous TN: ",t3.ctime(),"\ntime used: ", t3-t2)
ia = WAi.Index()
Co = WAi.TQtwo2oneCols()
Co.saveNetsJSON("CoIns.json",indent=2)
Co.delLoops()
C = Co.TQtopLinks(thresh=15)
len(C)
C[0]
C[1]
C[2]
tit = C[0][2]+" - "+C[0][3]; bd = C[0][5]
TQmax = 15; Tmin = 2000; Tmax = 2017; w = 600; h = 150
N.TQshow(bd,cdir,TQmax,Tmin,Tmax,w,h,tit,fill="red")
tit = C[2][2]+" - "+C[2][3]; ra = C[2][5]
TQmax = 10; Tmin = 1996; Tmax = 2017; w = 600; h = 150
N.TQshow(ra,cdir,TQmax,Tmin,Tmax,w,h,tit,fill="blue")
TQ.total(bd), TQ.total(ra)
We can also compute fractional temporal co-authorship networks (bavla).
AKr = WAr^T * WKr
tf-idf weights for Main : All. Select the 200 most important keywords as Keys. Extract from AKr the subnetwork on Main × Keys.
AJr = WAr^T * WJr
Instead of tf-idf weights we can use the fractional approach.
Add to Nets the procedure that converts a temporal network into a sequence of temporal slices (in Pajek format).
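One possible shape of such a procedure is sketched below. It does not use the Nets library. It assumes that a temporal quantity is a list of triples (s, f, v) meaning value v on the half-open interval [s, f), and that each slice is written as an undirected Pajek network. The function name `temporal_slices` and the toy data are hypothetical.

```python
# Hypothetical toy temporal network: node labels and links (u, v, tq),
# where tq is a list of triples (s, f, value) valid on the interval [s, f).
nodes = ["a", "b", "c"]
links = [
    ("a", "b", [(2000, 2002, 1), (2003, 2004, 2)]),
    ("b", "c", [(2001, 2002, 3)]),
]

def temporal_slices(nodes, links, t_min, t_max):
    """Yield (t, text) where text is the slice at time t in Pajek .net format."""
    index = {name: i + 1 for i, name in enumerate(nodes)}  # Pajek ids start at 1
    for t in range(t_min, t_max + 1):
        lines = [f"*vertices {len(nodes)}"]
        lines += [f'{i} "{name}"' for name, i in index.items()]
        lines.append("*edges")
        for u, v, tq in links:
            # Value of the link at time t: triples whose interval contains t.
            w = sum(val for s, f, val in tq if s <= t < f)
            if w:
                lines.append(f"{index[u]} {index[v]} {w}")
        yield t, "\n".join(lines) + "\n"

slices = dict(temporal_slices(nodes, links, 2000, 2003))
print(slices[2001])
```

Each slice keeps the full vertex list so that node numbers stay the same across all slices, which makes the resulting files directly comparable in Pajek.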