Financial networks

Transforming a sequence of transactions into a temporal network

July 7, 2020

The event data could be encoded as temporal quantities with a 'granularity' of 1 second (or even finer). For practical reasons, I selected a granularity of 1 day. The program can be easily adapted for weeks or months. Another interesting encoding would be by days of the week (Mon, Tue, …, Sun), possibly separately for each quarter of the year.
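
The alternative granularities mentioned above can be sketched with the standard library alone. This is a minimal illustration, not part of the conversion program: the helper names `day_index`, `week_index` and `weekday_index` are hypothetical, and the start date is assumed to be 2006-01-01 as in the program below.

```python
from datetime import datetime

# Assumed first day of the observed period (day 1).
START = datetime(2006, 1, 1)

def day_index(stamp):
    """1-based day number counted by calendar date from START."""
    return (stamp.date() - START.date()).days + 1

def week_index(stamp):
    """1-based number of the 7-day block containing the day."""
    return (day_index(stamp) - 1) // 7 + 1

def weekday_index(stamp):
    """Day of the week: 1 = Mon, ..., 7 = Sun."""
    return stamp.isoweekday()

t = datetime.strptime('2006-03-15T14:30:00', '%Y-%m-%dT%H:%M:%S')
print(day_index(t), week_index(t), weekday_index(t))
```

Counting by calendar date rather than by elapsed seconds keeps the day number independent of the local time zone.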

The network arcs have two weights:

  • 'tq' - temporal quantity describing transactions: a list of (day, day+1, value) triples
  • 'nt' - number of transactions
gdir = "c:/users/batagelj/work/python/graph/Nets"
wdir = "C:/Users/batagelj/Downloads/data/lomi"
import sys, os, re, json
sys.path = [gdir]+sys.path; os.chdir(wdir)
from TQ import *
from Nets import Network as N
from datetime import datetime

import pandas
df = pandas.read_csv('data_2006.txt')
SN = df.Sender.value_counts(); RN = df.Receiver.value_counts()
R = set(RN.keys()); S = set(SN.keys())
U = R | S

G = N(simple=True,temporal=True,network='year2006',title='2006')
minT = 1; maxT = 365
G._info['meta'] = [{"date":datetime.now().ctime(),"title":"CSV2netsJSON"}]
G._info['directed'] = True; G._info['time'] = (minT, maxT)
for i,e in enumerate(U):
  G.addNode(i,1); G._nodes[i][3] = {'lab':e,'act':[(minT,maxT+1,1)]}
I = G.Index()

# day 1 is 2006-01-01; days are counted by calendar date, so that a
# daylight saving time shift cannot move a transaction to a neighboring day
ds = '2006-01-01T00:00:00'; start = datetime.strptime(ds,'%Y-%m-%dT%H:%M:%S')
for r in range(len(df)):
  u = I[df.Sender[r]]; v = I[df.Receiver[r]]; a = (u,v)
  # create the arc on its first transaction
  if a not in G._links: G.addArc(u,v,w={'nt':0,'tq':[]},lid=a)
  datum = datetime.strptime(df.Date[r]+'T'+df.Time[r],'%Y-%m-%dT%H:%M:%S')
  day = 1 + (datum.date()-start.date()).days
  # add the amount to the arc's temporal quantity on the interval [day, day+1)
  tq = [ (day, day+1, df.Amount[r]) ]
  G._links[a][4]['tq'] = TQ.sum(G._links[a][4]['tq'],tq)
  G._links[a][4]['nt'] += 1
G.Info()
G.saveNetsJSON('data_2006.json',indent=1)
G.savePajek('data_2006-nt.net',key='nt')
G.savePajek('data_2006-totq.net',key='tq')
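
The 'tq' weight built by the program is a list of (start, end, value) triples, and each transaction is added to it as a one-day interval. The following is a plain-Python sketch of that accumulation for one-day intervals only. It is not the implementation of `TQ.sum`, which handles general intervals and may differ in details; the function name `tq_sum_daily` and the amounts are hypothetical.

```python
from collections import defaultdict

def tq_sum_daily(a, b):
    """Add two temporal quantities whose intervals are single days [d, d+1)."""
    per_day = defaultdict(float)
    for start, end, value in list(a) + list(b):
        if end != start + 1:
            raise ValueError("this sketch handles one-day intervals only")
        per_day[start] += value
    # return the triples ordered by day
    return [(d, d + 1, per_day[d]) for d in sorted(per_day)]

# hypothetical amounts: two earlier transactions, then a new one on day 3
arc_tq = [(3, 4, 100.0), (5, 6, 20.0)]
arc_tq = tq_sum_daily(arc_tq, [(3, 4, 50.0)])
print(arc_tq)
```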

Some analyses

July 8-9, 2020

The conversion program produces a temporal network

>>> G.Info()
network:  year2006 
 2006 
simple= True  directed= True  org= 1  mode= 1  multirel= False  temporal= True 
nodes= 172  links= 5014  arcs= 5014  edges= 0
Tmin= 1  Tmax= 365
>>>

with 172 nodes and 5014 arcs. The arcs have two weights: the total number of transactions 'nt' and the temporal quantity 'tq' describing the daily amounts of transactions. The network is saved in netsJSON format. In addition, it is saved in Pajek format twice: once with the weights 'nt' and once with weights equal to the total yearly amount of transactions.
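
As an illustration of how a total yearly amount relates to a daily temporal quantity, here is a minimal sketch. It assumes that the total is the sum of value times interval length over all triples, which for one-day intervals is simply the sum of the daily amounts; the function `tq_total` and the data are hypothetical, and the library's own `TQ.total` may be defined differently.

```python
def tq_total(tq):
    """Sum of value * interval length over all (start, end, value) triples."""
    return sum((end - start) * value for start, end, value in tq)

# hypothetical daily amounts on one arc
tq = [(3, 4, 150.0), (5, 6, 20.0), (200, 201, 30.5)]
print(tq_total(tq))
```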

For example, I decided to draw the 'nt' network with Pajek. Because the range of weights is very large, from 1 to 1627, I transformed them using SQRT (it preserves the order and reduces sizes). Using a spring embedder I got the picture
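
A small sketch of the effect of the SQRT transformation: it keeps the order of the weights while compressing their range. Only the extremes 1 and 1627 come from the network; the intermediate weights are made up for illustration.

```python
import math

# 1 and 1627 are the observed extremes of 'nt'; 4 and 100 are illustrative
weights = [1, 4, 100, 1627]
scaled = [math.sqrt(w) for w in weights]

# the order is preserved because the square root is increasing
order_preserved = scaled == sorted(scaled)

# ratio between the largest and the smallest weight, before and after
ratio_before = max(weights) / min(weights)
ratio_after = max(scaled) / min(scaled)
print(order_preserved, ratio_before, round(ratio_after, 2))
```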

It is not very readable but we can see that most of the transactions are among Italian banks. Banks from other countries are on the right side. A more detailed structure could be revealed using clustering.

I also exported the picture as an SVG file (2006 sqrt(nt)), which allows inspecting it by intervals of values by clicking on the dots in the upper left corner. We get, for example

We see that the largest number of transactions is from 'IT0269' to 'IT0268'. Using some additional programming

gdir = "c:/users/batagelj/work/python/graph/Nets"
wdir = "C:/Users/batagelj/Downloads/data/lomi"
cdir = "C:/Users/batagelj/Downloads/data/lomi/chart"

import sys, os, re, json
sys.path = [gdir]+sys.path; os.chdir(wdir)
from TQ import *
from Nets import Network as N
from datetime import datetime

def drawLinkTQ(u,v,fill='red',TQmax=None):
   global I, Tmin, Tmax, w, h
   tit = u+" -> "+v
   try:
      tq = G._links[G._nodes[I[u]][2][I[v]][0]][4]['tq']
      if TQmax is None: TQmax = 1.05*TQ.TqSummary(tq)[3]
      N.TQshow(tq,cdir,TQmax,Tmin,Tmax,w,h,tit,fill=fill)
   except:
      print("no link:",tit)

def drawNodeTQ(u,key,fill='red',TQmax=None):
   global I, Tmin, Tmax, w, h
   try:
      tq = G._nodes[I[u]][3][key]
      if TQmax is None: TQmax = 1.05*TQ.TqSummary(tq)[3]
      N.TQshow(tq,cdir,TQmax,Tmin,Tmax,w,h,u,fill=fill)
   except:
      print("error in node:",u)
      
G = N.loadNetsJSON('data_2006.json')
G.Info()

I = G.Index()
Tmin = 1; Tmax = 365; w = 800; h = 200
nu = 'IT0269'; nv = 'IT0268'
drawLinkTQ(nu,nv)
drawLinkTQ('IT0239','GB0023',fill='blue')

for u in G.nodes(): G.setNode(u,'outTQ',G.TQnetOutSum(u,key='tq'))
for u in G.nodes(): G.setNode(u,'inTQ',G.TQnetInSum(u,key='tq'))
iu = I[nu]   # index of the node with label nu
TQ.total(G._nodes[iu][3]['outTQ'])
# 198224.38
TQ.total(G._nodes[iu][3]['inTQ'])
# 2046.6499999999999
drawNodeTQ(nu,'outTQ',fill='orange')

we can display the temporal quantity describing daily amounts of transactions from one bank to the other, for example from 'IT0269' to 'IT0268'

or daily amounts of all outgoing transactions from a selected bank, for example for 'IT0269'

or daily amounts of all incoming transactions to a selected bank, for example for 'IT0269'
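
The node quantities 'outTQ' and 'inTQ' aggregate the temporal quantities of all arcs leaving or entering a node. The following plain-Python sketch shows the idea for one-day intervals. It is a stand-in written for illustration, not the code behind `G.TQnetOutSum` and `G.TQnetInSum`; the function `node_daily_sums`, the node labels and the amounts are hypothetical.

```python
from collections import defaultdict

def node_daily_sums(arcs):
    """arcs maps (sender, receiver) to a list of (day, day+1, value) triples.
    Returns two dicts: outgoing and incoming daily sums for each node."""
    out_days = defaultdict(lambda: defaultdict(float))
    in_days = defaultdict(lambda: defaultdict(float))
    for (u, v), tq in arcs.items():
        for day, _end, value in tq:
            out_days[u][day] += value   # amount sent by u on that day
            in_days[v][day] += value    # amount received by v on that day

    def as_tq(days):
        # convert a day -> amount mapping back to ordered triples
        return [(d, d + 1, days[d]) for d in sorted(days)]

    out_tq = {u: as_tq(days) for u, days in out_days.items()}
    in_tq = {v: as_tq(days) for v, days in in_days.items()}
    return out_tq, in_tq

# hypothetical network with three banks
arcs = {
    ('A', 'B'): [(1, 2, 10.0), (3, 4, 5.0)],
    ('A', 'C'): [(1, 2, 2.5)],
    ('C', 'A'): [(3, 4, 7.0)],
}
out_tq, in_tq = node_daily_sums(arcs)
print(out_tq['A'])
print(in_tq['A'])
```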

vlado/work/alg/fin.txt · Last modified: 2020/07/09 23:42 by vlado
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported