Compatible two-mode networks from CSV using R

October 25, 2014

My Spanish guest Gisela Cantos Mateos prepared two tables of cleaned WoS data in Excel: Documents X Institutions and Documents X KeyWordsPlus. The problem is that the set of Documents is not the same in both tables, because for some documents the data about institutions or keywords were missing. A document's Id (number) is the same in both tables. We need to produce the corresponding pair of compatible Pajek two-mode networks, that is, two networks over the same set of Documents.
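
The core idea, a single numbering of documents shared by both tables, can be sketched on toy data. The Ids below are invented for illustration, and the sketch uses `unique` and `match` instead of the hashed environment used in the script that follows; it is a minimal illustration, not the script's method.

```r
# Toy document Ids from two tables; the values are invented for illustration.
docs_ins  <- c("101", "102", "104")          # documents with institution data
docs_keys <- c("102", "103", "104", "105")   # documents with keyword data

# One shared numbering over the union of both Id sets,
# in order of first appearance.
all_docs <- unique(c(docs_ins, docs_keys))
r <- match(docs_ins,  all_docs)   # document indices for the first table
s <- match(docs_keys, all_docs)   # document indices for the second table

print(all_docs)
print(r)
print(s)
```

Because both index vectors refer to `all_docs`, a document missing from one table still has a vertex in that table's network; it is simply isolated there.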

# transforming two CSV files with different document sets into
# compatible 2-mode networks
# Vladimir Batagelj, 25. October 2014
# source("C:\\Users\\batagelj\\work\\R\\gisela\\gisela.R")

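# return the index of key in the dictionary dict (an environment);
# a key not seen before gets the next free index
getInd <- function(key,dict){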
  if(!exists(key,envir=dict,inherits=FALSE)) {
    assign(key,length(dict)+1,envir=dict) 
  } 
  return(get(key,envir=dict,inherits=FALSE))
}

cat('*** Transform:',date(),'\n'); flush.console()
setwd("C:/Users/batagelj/work/R/gisela")
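# read all columns as character so that Ids and names are kept as written
ins <- read.csv("Institutions.csv",sep=";",colClasses="character")
keys <- read.csv("KeyWordsPlus.csv",sep=";",colClasses="character")
# D is shared by both tables: each document gets one index for both networks
D <- new.env(hash=TRUE,parent=emptyenv())
r <- unlist(lapply(ins$Document,function(x){getInd(x,D)}))
s <- unlist(lapply(keys$Document,function(x){getInd(x,D)}))
# separate dictionaries for institutions (I) and keywords (K)
I <- new.env(hash=TRUE,parent=emptyenv())
K <- new.env(hash=TRUE,parent=emptyenv())
p <- unlist(lapply(ins$Institution,function(x){getInd(x,I)}))
q <- unlist(lapply(keys$KW.,function(x){getInd(x,K)}))
# number of documents (union of both tables)
lD <- length(D)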
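# two-mode network Documents X Institutions
WI <- file("WI.net","w")
# header: total number of vertices and size of the first mode (documents)
cat('*vertices',lD+length(I),lD,'\n',sep=' ',file=WI)
# ls() lists names alphabetically; each line carries its own vertex number
for(e in ls(D)) cat(getInd(e,D),' "',e,'"\n',sep='',file=WI)
# second-mode vertices are numbered after the documents
for(e in ls(I)) cat(lD+getInd(e,I),' "',e,'"\n',sep='',file=WI)
cat('*arcs\n',file=WI)
for(i in seq_along(r)) cat(r[i],lD+p[i],'\n',file=WI)
close(WI)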
# two-mode network Documents X KeyWordsPlus, same document vertices as WI.net
WK <- file("WK.net","w")
cat('*vertices',lD+length(K),lD,'\n',sep=' ',file=WK)
for(e in ls(D)) cat(getInd(e,D),' "',e,'"\n',sep='',file=WK)
for(e in ls(K)) cat(lD+getInd(e,K),' "',e,'"\n',sep='',file=WK)
cat('*arcs\n',file=WK)
for(i in seq_along(s)) cat(s[i],lD+q[i],'\n',file=WK)
close(WK)
cat('*** Finished:',date(),'\n\n') 
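
To see the layout of the resulting files, the following sketch writes a tiny two-mode network to a temporary file in the same way and reads it back. The labels, links and variable names are invented for illustration; only the file layout (a `*vertices` header with the total number of vertices and the size of the first mode, then the vertex lines, then `*arcs`) follows the script above.

```r
# Toy two-mode network: 3 documents, 2 institutions (labels are invented).
docs <- c("101", "102", "103")
inst <- c("Univ A", "Univ B")
# Links (document index, institution index); document 3 has no institution.
r <- c(1, 2, 2)
p <- c(1, 1, 2)

nD <- length(docs)
f <- tempfile(fileext = ".net")
con <- file(f, "w")
# Header: total number of vertices, then the size of the first mode.
cat("*vertices", nD + length(inst), nD, "\n", file = con)
for (i in seq_along(docs)) cat(i, ' "', docs[i], '"\n', sep = "", file = con)
# Second-mode vertices are numbered after the documents.
for (j in seq_along(inst)) cat(nD + j, ' "', inst[j], '"\n', sep = "", file = con)
cat("*arcs\n", file = con)
for (i in seq_along(r)) cat(r[i], nD + p[i], "\n", file = con)
close(con)

lines <- readLines(f)
print(lines)
```

Document 103 appears among the vertices although it has no arcs, which is what keeps the document set identical across the two networks.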

gisela.zip

notes/comp.txt · Last modified: 2015/07/16 21:04 by vlado
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported