We first implemented a temporal version of the normalized authorship network described in: Batagelj, V., Cerinšek, M.: On bibliographic networks. Scientometrics 96 (2013) 3, 845-864.
N = n(WA)

where the normalization n divides the temporal quantity on each arc (w, a) by the temporal outdegree of the work w, so that at every time point the weights of the arcs leaving a work sum to 1. The method TQnormal implements it:
def TQnormal(self, key='tq'):
    # returns a copy of the network in which the temporal quantity of each
    # arc leaving a work is divided by the work's temporal outdegree
    N = deepcopy(self)
    for u in N.nodesMode(1):   # works are the nodes of the first mode
        qu = TQ.TQ.invert(N.TQnetOutDeg(u), vZero=1)
        for p in N.outStar(u):
            N._links[p][4][key] = TQ.TQ.prod(qu, N._links[p][4][key])
    return N
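In static terms the normalization is just fractional counting: each work distributes a total weight of 1 among its authors. A minimal sketch without the TQ library, with illustrative names only:

# each work w distributes the weight 1 among its authors
WA = {'w1': ['a1', 'a2', 'a3'], 'w2': ['a2']}    # work -> list of authors
N = {(w, a): 1/len(A) for w, A in WA.items() for a in A}
print(N)   # {('w1','a1'): 0.333..., ('w1','a2'): 0.333..., ('w1','a3'): 0.333..., ('w2','a2'): 1.0}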
The program normalize.py transforms a given temporal two-mode network, read from the file file, into the corresponding normalized collaboration network Cc, which is saved to the file fjson:
gdir = 'c:/users/batagelj/work/python/graph/graph'
# wdir = 'c:/users/batagelj/work/python/graph/JSON/test'
wdir = 'c:/users/batagelj/work/python/graph/JSON/SN5'
import sys, os, datetime, json
sys.path = [gdir]+sys.path; os.chdir(wdir)
from GraphNew import Graph
# from copy import deepcopy
import TQ
# file = 'WAtestInst.json'
# file = 'WAtestCum.json'
file = 'WAcCum.json'
t1 = datetime.datetime.now()
print("started: ",t1.ctime(),"\n")
# load the temporal two-mode authorship network WA
WA = Graph.loadNetJSON(file)
t2 = datetime.datetime.now()
print("\nloaded: ",t2.ctime(),"\ntime used: ", t2-t1)
# normalize: N = n(WA)
N = WA.TQnormal()
t3 = datetime.datetime.now()
print("\nnormalized: ",t3.ctime(),"\ntime used: ", t3-t2)
# project onto the second mode (authors)
# Cc = N.TQtwo2oneCols(lType='arc')
Cc = N.TQtwo2oneCols()
t4 = datetime.datetime.now()
print("\nnormalized collaboration: ",t4.ctime(),"\ntime used: ", t4-t3)
# fjson = 'CcITest.json'
# fjson = 'CcCtest.json'
fjson = 'CcCumSN5.json'
Cc.saveNetJSON(fjson,indent=2)
t5 = datetime.datetime.now()
print("\nsave to file: ",t5.ctime(),"\ntime used: ", t5-t4)
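Assuming TQtwo2oneCols computes the column projection C = N^T * N over temporal quantities, its static analogue shows the key property of the normalization: every work contributes exactly 1 to the total of the collaboration network. A plain-Python sketch (illustrative names, not the Graph class):

from collections import defaultdict

N = {('w1','a1'): 1/3, ('w1','a2'): 1/3, ('w1','a3'): 1/3,
     ('w2','a2'): 1.0}                  # normalized WA from the sketch above

def two2oneCols(N):
    rows = defaultdict(dict)            # group the arcs by work
    for (w, a), v in N.items(): rows[w][a] = v
    C = defaultdict(float)
    for row in rows.values():           # C[a,b] = sum_w N[w,a]*N[w,b]
        for a, va in row.items():
            for b, vb in row.items():
                C[(a, b)] += va * vb
    return dict(C)

print(round(sum(two2oneCols(N).values()), 4))   # 2.0 -- one unit per work

Note that the projection also produces loops (a, a); the program PsCoresTQ.py below removes them with G.delLoops().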
As a real-data example we used the data set SN5 (2008). The temporal authorship network WA (restricted to works with DC = 1) has two versions: the instant WAcInst.json and the cumulative WAcCum.json. Both networks have the same sizes: |W| = 7950, |A| = 12458 and |L| = 19488.
>>> ====== RESTART: C:/Users/batagelj/work/Python/graph/graph/normalize.py ======
started:  Tue Oct 25 18:56:04 2016 

loaded:  Tue Oct 25 18:56:04 2016 
time used:  0:00:00.392023

normalized:  Tue Oct 25 18:56:06 2016 
time used:  0:00:01.801103

normalized collaboration:  Tue Oct 25 18:56:08 2016 
time used:  0:00:01.385079

save to file:  Tue Oct 25 18:56:12 2016 
time used:  0:00:04.236242
>>> 
For testing, the files WAtestInst.json and WAtestCum.json can be used.
To compute the pS cores I applied the program PsCoresTQ.py. At first, the program ran into an endless cycle.
I first suspected that rounding errors were the reason. But it turned out that the statement

cCore = TQ.TQ.complement(Core[u],Tmin,Tmax)

had to be changed to

cCore = TQ.TQ.complement(Core[u],Tmin,Tmax+1)

Temporal quantities are defined on half-open intervals [s, f), so with the upper bound Tmax the complement does not cover the last time point Tmax; the values of D[u] living there could never be moved to Core[u], and the main loop never terminated.
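A hypothetical re-implementation of the complement's interval logic (my own sketch for illustration, not the TQ source) makes the off-by-one visible:

def complement(tq, Tmin, Tmax, vZero=1):
    # fill the gaps of a temporal quantity tq inside the half-open
    # lifetime [Tmin, Tmax) with the value vZero
    res, t = [], Tmin
    for s, f, v in tq:
        if t < s: res.append((t, s, vZero))
        t = max(t, f)
    if t < Tmax: res.append((t, Tmax, vZero))
    return res

# network lifetime 1..7; Core[u] already set on [3, 5)
print(complement([(3, 5, 2.0)], 1, 7))   # [(1, 3, 1), (5, 7, 1)] -- time point 7 is lost
print(complement([(3, 5, 2.0)], 1, 8))   # [(1, 3, 1), (5, 8, 1)] -- Tmax+1 covers it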
gdir = 'c:/users/batagelj/work/python/graph/graph'   # as in normalize.py
wdir = 'c:/users/batagelj/work/python/graph/JSON/SN5'
import sys, os, datetime, json
sys.path = [gdir]+sys.path; os.chdir(wdir)
import GraphNew as Graph
import TQ
fJSON = 'CcCumSN5.json'
G = Graph.Graph.loadNetJSON(fJSON)
G.delLoops()
print("Temporal Ps cores in: ",fJSON)
t1 = datetime.datetime.now()
print("started: ",t1.ctime(),"\n")
Tmin,Tmax = G._graph['time']
# D[u] = temporal sum of the link weights at node u
D = { u: G.TQnetSum(u) for u in G._nodes }
# the core values start with the intervals on which a node's sum is 0
Core = { u: [d for d in D[u] if d[2]==0] for u in G.nodes() }
D = { u: [d for d in D[u] if d[2]>0] for u in G.nodes() }
D = { u: d for u,d in D.items() if d!=[] }
Dmin = { u: min([e[2] for e in d]) for u,d in D.items() }
step = 0
while len(D)>0:
    step += 1
    # pick the node u with the current minimal value dmin
    dmin,u = min( (v,k) for k,v in Dmin.items() )
    if step % 100 == 1:
        print("{0:3d}. dmin={1:10.4f} node={2:4d}".format(step,dmin,u))
    # intervals on which u's core value is not determined yet (note Tmax+1)
    cCore = TQ.TQ.complement(Core[u],Tmin,Tmax+1)
    core = TQ.TQ.extract(cCore,[d for d in D[u] if d[2] == dmin])
    if core!=[]:
        # fix the core value dmin for u on these intervals and
        # decrease the values of u's neighbors accordingly
        Core[u] = TQ.TQ.sum(Core[u],core)
        D[u] = TQ.TQ.cutGE(TQ.TQ.sum(D[u],TQ.TQ.minus(core)),dmin)
        for link in G.star(u):
            v = G.twin(u,link)
            if not(v in D): continue
            chLink = TQ.TQ.minus(TQ.TQ.extract(core,G.getLink(link,'tq')))
            if chLink==[]: continue
            diff = TQ.TQ.cutGE(TQ.TQ.sum(D[v],chLink),0)
            D[v] = [ (sd,fd,max(vd,dmin)) for sd,fd,vd in diff ]
            if len(D[v])==0: del D[v]; del Dmin[v]
            else: Dmin[v] = min([e[2] for e in D[v]])
    if len(D[u])==0: del D[u]; del Dmin[u]
    else: Dmin[u] = min([e[2] for e in D[u]])
print("{0:3d}. dmin={1:10.4f} node={2:4d}".format(step,dmin,u))
t2 = datetime.datetime.now()
print("\nfinished: ",t2.ctime(),"\ntime used: ", t2-t1)
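The procedure is the temporal version of the generalized cores scheme: repeatedly pick the node u with the minimal value dmin, fix its core value, and decrease the values of its neighbors. For intuition, a static analogue (my own sketch with illustrative names, not the program above):

def pS_cores(nodes, weight):
    # weight: symmetric dict {(u, v): w}; p(v) = sum of the weights of
    # the links from v to the still-active nodes
    active = {u: 0.0 for u in nodes}
    for (u, v), w in weight.items(): active[u] += w
    core, dmin = {}, 0.0
    while active:
        u = min(active, key=active.get)   # node with the minimal p-value
        dmin = max(dmin, active[u])       # core levels never decrease
        core[u] = dmin
        del active[u]
        for v in active:                  # removing u lowers its neighbors
            active[v] -= weight.get((v, u), 0.0)
    return core

w = {('a','b'): 1.0, ('b','a'): 1.0, ('b','c'): 0.5, ('c','b'): 0.5}
print(pS_cores(['a','b','c'], w))         # {'c': 0.5, 'a': 1.0, 'b': 1.0}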
It turned out that the pS cores procedure is quite fast: it computed the cores on SN5 in about 19 seconds.
>>> ====== RESTART: C:\Users\batagelj\work\Python\graph\graph\PsCoresTQ.py ======
Temporal Ps cores in:  CcCSN5.json
started:  Tue Oct 25 17:23:26 2016 

    1. dmin=    0.0408 node= 548
  101. dmin=    0.0476 node=5079
  201. dmin=    0.1244 node=9917
  301. dmin=    0.1528 node=4316
  401. dmin=    0.1653 node=7756
  501. dmin=    0.1800 node=7123
  601. dmin=    0.1975 node= 873
...
14201. dmin=    0.9444 node=5560
14301. dmin=    1.0000 node=1056
14401. dmin=    1.0000 node=4485
14501. dmin=    1.0072 node=1358
14601. dmin=    1.1389 node=2051
14701. dmin=    1.2850 node= 337
14801. dmin=    1.3889 node=4560
14901. dmin=    1.5000 node=4045
15001. dmin=    1.7222 node=3130
15101. dmin=    2.1428 node=1924
15201. dmin=    3.0000 node= 796
15261. dmin=    9.7917 node=  20

finished:  Tue Oct 25 17:23:45 2016 
time used:  0:00:19.644124
>>> 
The dictionary Core of temporal core values can be cut at a selected level: TQdictCut(Core,3) keeps, for each author, the time intervals in which his or her pS core value is at least 3:

>>> C = TQ.TQ.TQdictCut(Core,3)
>>> for v in C: print("{0:3d} : {1:11s} ".format(v,G.getNode(v,'lab')),C[v])

  20 : BORGATTI_S  [(1991, 1992, 3.1667), (1992, 1993, 4.1667), (1993, 1994, 5.1667), (1994, 1996, 6.1667), (1996, 1997, 6.6667), (1997, 1999, 7.1667), (1999, 2003, 8.6667), (2003, 2005, 8.7917), (2005, 2006, 9.2917), (2006, 2009, 9.7917)]
3169 : EVERETT_M   [(1991, 1992, 3.1667), (1992, 1993, 4.1667), (1993, 1994, 5.1667), (1994, 1996, 6.1667), (1996, 1997, 6.6667), (1999, 2003, 8.6667), (2003, 2005, 8.7917), (2005, 2006, 9.2917), (2006, 2009, 9.7917)]
 317 : BERNARD_H   [(1990, 1991, 3.0244), (1991, 1995, 3.1494), (1995, 1997, 3.3094), (1997, 1998, 3.3894), (1998, 2001, 3.5494), (2001, 2003, 3.6294), (2003, 2006, 3.685), (2006, 2009, 4.0706)]
2232 : KILLWORT_P  [(1990, 1991, 3.0244), (1991, 1995, 3.1494), (1995, 1997, 3.3094), (2003, 2006, 3.685), (2006, 2009, 4.0706)]
4551 : STEINHAU_H  [(2003, 2005, 3.0), (2005, 2006, 3.2222), (2006, 2009, 3.6667)]
4860 : METZKE_C    [(2003, 2005, 3.0), (2005, 2006, 3.2222), (2006, 2009, 3.6667)]
3125 : SHELLEY_G   [(2006, 2009, 3.4767)]
1673 : MCCARTY_C   [(2006, 2009, 3.4767)]
1677 : JOHNSEN_E   [(2006, 2009, 3.4767)]
  75 : HOLLAND_P   [(1981, 1983, 3.0), (1983, 2009, 3.2222)]
  78 : LEINHARD_S  [(1981, 1983, 3.0), (1983, 2009, 3.2222)]
 925 : BONACICH_P  [(1997, 2009, 3.2222)]
3840 : BIENENST_E  [(1997, 2009, 3.2222)]
  69 : WASSERMA_S  [(2007, 2009, 3.0174)]
1164 : DOREIAN_P   [(2007, 2009, 3.0174)]
1166 : HUMMON_N    [(2007, 2009, 3.0174)]
1680 : PATTISON_P  [(2007, 2009, 3.0174)]
3225 : FARARO_T    [(2007, 2009, 3.0174)]
1056 : FAUST_K     [(2007, 2009, 3.0174)]
3170 : FERLIGOJ_A  [(2007, 2009, 3.0174)]
2083 : ROBINS_G    [(2007, 2009, 3.0174)]
2084 : SKVORETZ_J  [(2007, 2009, 3.0174)]
 949 : BATAGELJ_V  [(2007, 2009, 3.0174)]
  79 : NEWMAN_M    [(2005, 2009, 3.0)]
 796 : PARK_J      [(2005, 2009, 3.0)]
>>>
For comparison, the temporal sums (TQnetSum) of some selected authors in the collaboration network:

>>> D = { u: G.TQnetSum(u) for u in G._nodes }
>>> S = [ 20, 3169, 1164, 3170, 949, 79 ]
>>> for v in S: print("{0:3d} : {1:11s} ".format(v,G.getNode(v,'lab')),D[v])

  20 : BORGATTI_S  [(1970, 1988, 0), (1988, 1989, 0.5), (1989, 1990, 1.4444), (1990, 1991, 3.3333), (1991, 1992, 4.2778), (1992, 1993, 5.2778), (1993, 1994, 6.2778), (1994, 1996, 7.7778), (1996, 1997, 8.2778), (1997, 1998, 9.2222), (1998, 1999, 9.5972), (1999, 2001, 11.0972), (2001, 2002, 11.9167), (2002, 2003, 12.3611), (2003, 2005, 14.1111), (2005, 2006, 15.1111), (2006, 2007, 16.0556), (2007, 2009, 16.9204)]
3169 : EVERETT_M   [(1970, 1988, 0), (1988, 1989, 1.0), (1989, 1990, 1.9444), (1990, 1991, 3.8333), (1991, 1992, 4.3333), (1992, 1993, 5.3333), (1993, 1994, 6.3333), (1994, 1996, 7.3333), (1996, 1997, 7.8333), (1997, 1999, 8.3333), (1999, 2003, 10.3333), (2003, 2004, 10.7083), (2004, 2005, 11.1528), (2005, 2006, 11.6528), (2006, 2007, 12.1528), (2007, 2009, 12.5972)]
1164 : DOREIAN_P   [(1970, 1984, 0), (1984, 1985, 0.5), (1985, 1989, 1.0), (1989, 1990, 1.5), (1990, 1992, 2.4444), (1992, 1994, 3.8333), (1994, 1995, 5.2778), (1995, 1996, 5.7222), (1996, 2000, 7.0972), (2000, 2001, 7.5417), (2001, 2003, 8.0417), (2003, 2004, 8.5417), (2004, 2007, 9.4306), (2007, 2009, 9.4714)]
3170 : FERLIGOJ_A  [(1970, 1982, 0), (1982, 1983, 0.5), (1983, 1992, 1.0), (1992, 1994, 2.3889), (1994, 1999, 2.8333), (1999, 2000, 3.3333), (2000, 2001, 3.7778), (2001, 2002, 4.2778), (2002, 2004, 4.6528), (2004, 2005, 6.6389), (2005, 2007, 7.1389), (2007, 2009, 7.1797)]
 949 : BATAGELJ_V  [(1970, 1982, 0), (1982, 1983, 0.5), (1983, 1992, 1.0), (1992, 1994, 2.3889), (1994, 1999, 2.8333), (1999, 2000, 3.7222), (2000, 2001, 4.6667), (2001, 2002, 5.1667), (2002, 2004, 5.6667), (2004, 2005, 6.5556), (2005, 2007, 7.0556), (2007, 2009, 7.5964)]
  79 : NEWMAN_M    [(1970, 1999, 0), (1999, 2000, 1.0), (2000, 2001, 2.3194), (2001, 2002, 3.5283), (2002, 2003, 5.80611), (2003, 2004, 7.1811), (2004, 2005, 10.3206), (2005, 2006, 12.01556), (2006, 2007, 15.7239), (2007, 2009, 17.0989)]
>>>
Finally, the links in the star of the node 3174 (Mrvar) with their temporal quantities:

>>> u = 3174   # Mrvar
>>> for p in G.star(u): v = G.twin(u,p); print(v,G.getNode(v,'lab'),':',G._links[p][4]['tq'])

1164 DOREIAN_P : [[1996, 2009, 0.5]]
4300 ZAVERSNI_M : [[1999, 2009, 0.2222]]
302 WHITE_D : [[1999, 2009, 0.2222]]
949 BATAGELJ_V : [[1999, 2000, 0.4444], [2000, 2001, 0.9444], [2001, 2002, 1.4444], [2002, 2009, 1.9444]]
>>>
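The temporal sums in D are just pointwise sums of the link TQs listed above. A rough sketch of such a sum (my own simplification, treating undefined values as 0; not the TQ library's implementation):

def tq_sum(a, b):
    # pointwise sum of two temporal quantities given as lists of
    # (s, f, v) triples with half-open activity intervals [s, f)
    ts = sorted({t for s, f, v in a + b for t in (s, f)})
    def val(tq, t): return next((v for s, f, v in tq if s <= t < f), 0)
    res = []
    for s, f in zip(ts, ts[1:]):
        v = val(a, s) + val(b, s)
        if v == 0: continue
        if res and res[-1][1] == s and res[-1][2] == v:
            res[-1] = (res[-1][0], f, v)   # merge adjacent equal values
        else:
            res.append((s, f, v))
    return res

print(tq_sum([(1996, 2009, 0.5)], [(1999, 2009, 0.2222)]))
# [(1996, 1999, 0.5), (1999, 2009, 0.7222)] (up to float rounding)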