====== Important bibliographic subnetworks ======
March 2022
Construction of important bibliographic subnetworks from a collection of bibliographic networks obtained from WoS using [[https://github.com/bavla/biblio/blob/master/WoS2Pajek/WoS2Pajek14.pdf|WoS2Pajek]].
===== SocNet 2018 networks =====
For illustration, we shall use the collection on network analysis (2018) [[https://link.springer.com/article/10.1007/s11192-019-03193-x|paper]], [[https://github.com/bavla/SocNet/tree/master/nets|files]].
^ Network ^ # Nodes (sum) ^ # Mode 1 ^ # Mode 2 ^ # Arcs ^
| CiteN | 1,297,133 | | | 2,753,633 |
| CiteR | 70,792 | | | 398,199 |
| WAn | 1,693,104 | 1,297,133 | 395,971 | 1,442,240 |
| WAr | 163,803 | 70,792 | 93,011 | 215,901 |
| WKn | 1,329,542 | 1,297,133 | 32,409 | 1,167,666 |
| WKr | 103,201 | 70,792 | 32,409 | 1,167,666 |
| WJn | 1,366,279 | 1,297,133 | 69,146 | 720,044 |
| WJr | 79,735 | 70,792 | 8,943 | 61,741 |
===== Co - First Co-authorship network =====
**Co** = **WAr**<sup>T</sup> * **WAr**
Co[a,b] = # works that authors a and b co-authored
Co[a,a] = # works of author a
Problem: **Co** is a sum of complete subgraphs, one spanned on the co-authors of each work. Works with a large number of co-authors blur the picture.
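The definition can be checked on a toy example. The sketch below uses NumPy instead of Pajek; the 3 x 4 matrix ''WA'' is made-up illustration data, not taken from the SocNet collection. It also shows the stated problem: a work with k authors adds k*k to the total weight of **Co**.

```python
import numpy as np

# Hypothetical toy works x authors matrix (rows = works, columns = authors).
WA = np.array([
    [1, 1, 0, 0],   # work 0: authors 0 and 1
    [1, 1, 1, 0],   # work 1: authors 0, 1 and 2
    [0, 0, 1, 1],   # work 2: authors 2 and 3
])

# Co[a,b] = number of works co-authored by a and b; Co[a,a] = number of works of a.
Co = WA.T @ WA

# Each work with k authors contributes a complete k x k block of ones,
# so the total weight equals the sum of squared numbers of authors.
authors_per_work = WA.sum(axis=1)
total_weight = Co.sum()
sum_of_squares = (authors_per_work ** 2).sum()
print(Co)
print(total_weight, sum_of_squares)
```

The three-author work alone accounts for 9 of the total weight, more than the two two-author works together.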
read WAr
info
2-Mode Network: Rows=70792, Cols=93011
Network/Create vector/centrality/degree/input
Info vector [+200]
chinese
Network/2-mode/transpose
select WAr as second
Networks/multiply [yes]
Network/create new/transform/remove/loops [yes]
Network/create new/transform/arcs->edges/bidirected/min [no]
File/Network/change label [Co]
Network/Info/Line values
Line Values Frequency Freq% CumFreq CumFreq%
----------------------------------------------------------------------------
( ... 1.0000] 251440 86.1772 251440 86.1772
( 1.0000 ... 2.0000] 27072 9.2785 278512 95.4557
( 2.0000 ... 3.0000] 7252 2.4855 285764 97.9412
( 3.0000 ... 4.0000] 2895 0.9922 288659 98.9334
( 4.0000 ... 5.0000] 1350 0.4627 290009 99.3961
( 5.0000 ... 6.0000] 720 0.2468 290729 99.6429
( 6.0000 ... 7.0000] 329 0.1128 291058 99.7556
( 7.0000 ... 8.0000] 225 0.0771 291283 99.8327
( 8.0000 ... 9.0000] 143 0.0490 291426 99.8818
( 9.0000 ... 10.0000] 86 0.0295 291512 99.9112
( 10.0000 ... 11.0000] 58 0.0199 291570 99.9311
( 11.0000 ... 12.0000] 57 0.0195 291627 99.9506
( 12.0000 ... 13.0000] 32 0.0110 291659 99.9616
( 13.0000 ... 14.0000] 23 0.0079 291682 99.9695
( 14.0000 ... 15.0000] 17 0.0058 291699 99.9753
( 15.0000 ... 16.0000] 9 0.0031 291708 99.9784
( 16.0000 ... 17.0000] 9 0.0031 291717 99.9815
( 17.0000 ... 18.0000] 5 0.0017 291722 99.9832
( 18.0000 ... 19.0000] 8 0.0027 291730 99.9859
( 19.0000 ... 20.0000] 4 0.0014 291734 99.9873
( 20.0000 ... 21.0000] 1 0.0003 291735 99.9877
( 21.0000 ... 22.0000] 13 0.0045 291748 99.9921
( 22.0000 ... 23.0000] 4 0.0014 291752 99.9935
( 23.0000 ... 24.0000] 2 0.0007 291754 99.9942
( 24.0000 ... 25.0000] 3 0.0010 291757 99.9952
( 25.0000 ... 26.0000] 1 0.0003 291758 99.9955
( 26.0000 ... 27.0000] 3 0.0010 291761 99.9966
( 27.0000 ... 28.0000] 1 0.0003 291762 99.9969
( 28.0000 ... 29.0000] 1 0.0003 291763 99.9973
( 29.0000 ... 30.0000] 1 0.0003 291764 99.9976
( 30.0000 ... 31.0000] 3 0.0010 291767 99.9986
( 31.0000 ... 32.0000] 1 0.0003 291768 99.9990
( 32.0000 ... 33.0000] 1 0.0003 291769 99.9993
( 33.0000 ... 34.0000] 0 0.0000 291769 99.9993
( 34.0000 ... 35.0000] 0 0.0000 291769 99.9993
( 35.0000 ... 36.0000] 0 0.0000 291769 99.9993
( 36.0000 ... 37.0000] 0 0.0000 291769 99.9993
( 37.0000 ... 38.0000] 1 0.0003 291770 99.9997
( 38.0000 ... 39.0000] 0 0.0000 291770 99.9997
( 39.0000 ... 40.0000] 0 0.0000 291770 99.9997
( 40.0000 ... 41.0000] 0 0.0000 291770 99.9997
( 41.0000 ... 42.0000] 0 0.0000 291770 99.9997
( 42.0000 ... 43.0000] 1 0.0003 291771 100.0000
----------------------------------------------------------------------------
The most active co-authors.
% edge cut at 10
Network/create new/transform/remove/lines with value/lower than [10][yes]
Network/create partition/degree/all
File/partition/change label [cut 10]
Operations/network+partition/extract/subnetwork [1-*]
Draw/network+first partition
Mostly small components. To get more structure we must lower the threshold.
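The effect of the threshold can be mimicked outside Pajek. This is a minimal sketch, assuming the network is given as a list of (author, author, weight) triples and that "lower than [t]" keeps the lines with value at least t. The function name ''cut_and_components'' and the toy edges are hypothetical.

```python
from collections import defaultdict

def cut_and_components(edges, threshold):
    """Keep edges with weight >= threshold and return the connected
    components of the remaining non-isolated nodes, largest first."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, w in edges:
        if w >= threshold:
            parent[find(u)] = find(v)

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return sorted(groups.values(), key=len, reverse=True)

# Made-up weighted co-authorship edges.
edges = [("a", "b", 12), ("b", "c", 6), ("c", "d", 5),
         ("e", "f", 11), ("g", "h", 3)]

comps10 = cut_and_components(edges, 10)   # strict cut
comps5 = cut_and_components(edges, 5)     # lower threshold
print(comps10)
print(comps5)
```

In the toy data the cut at 10 leaves only isolated pairs, while the cut at 5 joins a, b, c and d into one larger component.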
Select Co
Network/create new/transform/remove/lines with value/lower than [5][yes]
Network/create partition/components/weak [5]
Partition/canonical/decreasing
Info partition
Operations/network+partition/extract/subnetwork [1-*]
Draw/network+first partition
Large (661 nodes) component of mainly Chinese authors. We exclude them.
Select canonical partition
Partition/binarize [2-*]
Select partition cut 10 as second
Partitions max
Select Co
Operations/network+partition/extract/subnetwork [1-*]
Network/create partition/components/weak [1]
Draw/network+first partition
Large component with the main authors.
Another option is to make a list (partition) of "interesting" authors (for example SNA) and
extract the corresponding subnetwork from **Co**.
===== Ct' - Newman's strict co-authorship =====
Works with many co-authors are overrepresented in the network **Co**: they contribute more to the total weight. To make the contributions of all works equal, we use the [[https://link.springer.com/article/10.1007/s11192-020-03383-y|fractional approach]] (normalization).
Normalization of a (2-mode) binary network **N**, n[w,a] ∈ {0,1} (Pajek [[https://github.com/bavla/biblio/tree/master/Pajek/macro|macros]]):
n(**N**)[w,a] = n[w,a]/max(1,outdeg(w))
n'(**N**)[w,a] = n[w,a]/max(1,outdeg(w)-1)
Ct' = n(**WAr**)<sup>T</sup> * n'(**WAr**)
Select WAr
Macro/play/norm2p [70792] -> n'(WAr)
Select WAr
Macro/play/norm2 [70792] -> n(WAr)
Network/2-mode/transpose
Select n'(WAr) as second
Networks/Multiply [yes]
Network/create new/transform/remove/loops [yes]
Network/create new/transform/arcs->edges/bidirected/sum [no]
File/Network/change label [Ct']
Network/Info/Line values [#25]
Line Values Frequency Freq% CumFreq CumFreq%
---------------------------------------------------------------------------
( ... 0.0001] 7861 2.6942 7861 2.6942
( 0.0001 ... 1.0348] 278858 95.5743 286719 98.2685
( 1.0348 ... 2.0696] 3943 1.3514 290662 99.6199
( 2.0696 ... 3.1043] 751 0.2574 291413 99.8773
( 3.1043 ... 4.1390] 216 0.0740 291629 99.9513
( 4.1390 ... 5.1737] 58 0.0199 291687 99.9712
( 5.1737 ... 6.2084] 44 0.0151 291731 99.9863
( 6.2084 ... 7.2432] 23 0.0079 291754 99.9942
( 7.2432 ... 8.2779] 7 0.0024 291761 99.9966
( 8.2779 ... 9.3126] 3 0.0010 291764 99.9976
( 9.3126 ... 10.3473] 3 0.0010 291767 99.9986
( 10.3473 ... 11.3820] 1 0.0003 291768 99.9990
( 11.3820 ... 12.4167] 0 0.0000 291768 99.9990
( 12.4167 ... 13.4515] 0 0.0000 291768 99.9990
( 13.4515 ... 14.4862] 0 0.0000 291768 99.9990
( 14.4862 ... 15.5209] 1 0.0003 291769 99.9993
( 15.5209 ... 16.5556] 0 0.0000 291769 99.9993
( 16.5556 ... 17.5903] 1 0.0003 291770 99.9997
( 17.5903 ... 18.6250] 0 0.0000 291770 99.9997
( 18.6250 ... 19.6597] 0 0.0000 291770 99.9997
( 19.6597 ... 20.6945] 0 0.0000 291770 99.9997
( 20.6945 ... 21.7292] 0 0.0000 291770 99.9997
( 21.7292 ... 22.7639] 0 0.0000 291770 99.9997
( 22.7639 ... 23.7986] 0 0.0000 291770 99.9997
( 23.7986 ... 24.8333] 1 0.0003 291771 100.0000
---------------------------------------------------------------------------
The threshold is now much lower.
Ct' edge cut 4
Ct' edge cut 2
cut 2 components >= 5
union
Ct' extract union
Partition weak
Draw
Partition for extraction of selected components.
select union
Partition binarize [1-*]
Operations/network+partition/transform/remove lines/between clusters
Operations/network+partition/transform/remove lines/between two clusters [0][0]
Network/create partition/components/weak [2]
Partition/canonical/decreasing !!! for extracting selected components
Info partition
Select Ct'
Operations/network+partition/extract/subnetwork [1]
Draw
Partition for extraction of the main component.
Select canonical
Partition binarize [1]
Info
save partition to Main.clu
===== Jaccard weights using Pajek =====
April 6, 2022
select Co (with loops)
Network/Create vector/Get loops
Operations/Network+vector/Transform/Vector -> Line value/initial
File/Network/Change label [Co[e,e]]
select Co (with loops)
Operations/Network+vector/Transform/Vector -> Line value/terminal
File/Network/Change label [Co[f,f]]
Select Co[e,e] as Second
Networks/Cross intersection/add
select Co (with loops) as Second
Networks/Cross intersection/subtract
Select Last as Second
select Co (with loops) as First
Networks/Cross intersection/divide
Network/Create new network/Transform/Remove/loops
Network/Create new network/Transform/Arcs -> edges/bidirected/min
File/Network/Change label [Jaccard]
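The Pajek steps above amount to Jaccard[e,f] = Co[e,f] / (Co[e,e] + Co[f,f] - Co[e,f]): the loop values are copied onto the lines from their initial and terminal nodes, added, reduced by Co, and finally Co is divided by the result. A minimal NumPy sketch of the same computation on made-up data (the matrix ''WA'' is hypothetical):

```python
import numpy as np

# Hypothetical toy works x authors matrix (rows = works, columns = authors).
WA = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
])

Co = WA.T @ WA                       # co-authorship counts, with loops
d = np.diag(Co)                      # Co[e,e] = number of works of author e

# Size of the union of the two authors' sets of works.
union = d[:, None] + d[None, :] - Co

# Divide only where the union is non-empty; other entries stay 0.
J = np.divide(Co, union, out=np.zeros(Co.shape), where=union > 0)
np.fill_diagonal(J, 0)               # remove loops
print(J)
```

Authors 0 and 1 wrote exactly the same works, so their Jaccard weight is 1.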
===== Temporal co-authorship networks =====
[[https://www.sciencedirect.com/science/article/pii/S1751157719301439|paper]],
[[https://github.com/bavla/SocNet/wiki/SetUp|SetUp]]
gdir = 'C:/Users/vlado/work/Python/graph/Nets'
wdir = 'C:/Users/vlado/work/Python/WoS/SocNet/2022'
ndir = 'C:/Users/vlado/work/Python/WoS/SocNet/2022'
cdir = 'C:/Users/vlado/work/Python/graph/Nets/chart'
import sys, os, datetime, json
sys.path = [gdir]+sys.path; os.chdir(wdir)
from TQ import *
from Nets import Network as N
net = ndir+"/WAr.net"
clu = ndir+"/YearR.clu"
t1 = datetime.datetime.now(); print("started: ",t1.ctime(),"\n")
WAc = N.twoMode2netsJSON(clu,net,"WAcum.json",instant=False)
t2 = datetime.datetime.now()
print("\nconverted to cumulative TN: ",t2.ctime(),"\ntime used: ", t2-t1)
WAi = N.twoMode2netsJSON(clu,net,"WAins.json",instant=True)
t3 = datetime.datetime.now()
print("\nconverted to instantaneous TN: ",t3.ctime(),"\ntime used: ", t3-t2)
ia = WAi.Index()
Co = WAi.TQtwo2oneCols()
Co.saveNetsJSON("CoIns.json",indent=2)
Co.delLoops()
C = Co.TQtopLinks(thresh=15)
len(C)
C[0]
C[1]
C[2]
tit = C[0][2]+" - "+C[0][3]; bd = C[0][5]
TQmax = 15; Tmin = 2000; Tmax = 2017; w = 600; h = 150
N.TQshow(bd,cdir,TQmax,Tmin,Tmax,w,h,tit,fill="red")
tit = C[2][2]+" - "+C[2][3]; ra = C[2][5]
TQmax = 10; Tmin = 1996; Tmax = 2017; w = 600; h = 150
N.TQshow(ra,cdir,TQmax,Tmin,Tmax,w,h,tit,fill="blue")
TQ.total(bd), TQ.total(ra)
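The difference between instantaneous and cumulative temporal weights can be illustrated without the Nets and TQ libraries. The sketch below is plain Python on made-up records and does not reproduce their API or data format; the function names ''instantaneous'' and ''cumulative'' are hypothetical, and it only assumes that the ''instant'' flag above switches between per-year and accumulated values.

```python
from collections import Counter
from itertools import accumulate, combinations

# Made-up (year, authors) records standing in for WAr and YearR.
works = [
    (2000, ["A", "B"]),
    (2000, ["A", "B", "C"]),
    (2002, ["A", "B"]),
]

def instantaneous(works):
    """For each pair of co-authors, count joint works per year."""
    counts = {}
    for year, authors in works:
        for pair in combinations(sorted(set(authors)), 2):
            counts.setdefault(pair, Counter())[year] += 1
    return counts

def cumulative(per_year, t_min, t_max):
    """Running total of the yearly counts over t_min..t_max."""
    years = range(t_min, t_max + 1)
    return dict(zip(years, accumulate(per_year.get(y, 0) for y in years)))

ins = instantaneous(works)
ab_ins = dict(ins[("A", "B")])
ab_cum = cumulative(ins[("A", "B")], 2000, 2002)
print(ab_ins)
print(ab_cum)
```

The instantaneous series is zero in years without joint works, while the cumulative series never decreases.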
We can also compute fractional temporal co-authorship networks [[https://github.com/bavla/SocNet/wiki/WAt#using-python-for-graphs|bavla]].
===== To do =====
**AKr** = **WAr**<sup>T</sup> * **WKr**
[[https://en.wikipedia.org/wiki/Tf%E2%80%93idf|tf-idf]] weights for Main : All. Select 200 most important keywords as Keys. Extract from **AKr** the subnetwork on Main X Keys.
**AJr** = **WAr**<sup>T</sup> * **WJr**
Instead of tf-idf weights we can use the fractional approach.
Add to Nets the procedure that converts a temporal network into a sequence of temporal slices (in Pajek format).