notes:net:comp [2015/07/16 22:32] (current)
vlado created
 +====== Compatible two-mode networks from CSV using R ======
 +
 +**October 25, 2014**
 +
 +My Spanish guest Gisela Cantos Mateos prepared two tables of cleaned data from WoS in Excel: Documents X Institutions and Documents X KeyWordsPlus. The problem is that the set of Documents is not the same in both tables: for some documents the data about institutions or keywords were missing. The document's Id (number) is the same in both tables. We need to produce the corresponding pair of compatible Pajek two-mode networks, that is, two networks with the same Documents set.
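 +
 +The idea of a common Documents index can be illustrated on toy data. This is only a minimal sketch: the two small tables and the column name ''KW'' are made up for the example, and base R ''union'' and ''match'' stand in for the dictionary used in the script itself.
 +
 +<code>
 +# toy data (hypothetical): document 3 has no institution, document 2 has no keyword
 +ins  <- data.frame(Document=c("1","1","2"), Institution=c("A","B","A"),
 +                   stringsAsFactors=FALSE)
 +keys <- data.frame(Document=c("1","3"), KW=c("x","y"),
 +                   stringsAsFactors=FALSE)
 +# one index over the union of both Document columns
 +docs <- union(ins$Document, keys$Document)
 +r <- match(ins$Document, docs)    # document index for each row of ins
 +s <- match(keys$Document, docs)   # document index for each row of keys
 +print(docs); print(r); print(s)
 +</code>
 +
 +Both index vectors refer to the same list of three documents, even though each table mentions only two of them.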
 +
 +<code>
 +# transforming two CSV files with different Documents sets into
 +# compatible 2-mode networks (sharing the same Documents set)
 +# Vladimir Batagelj, 25. October 2014
 +# source("C:\\Users\\batagelj\\work\\R\\gisela\\gisela.R")
 +
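 +# getInd returns the index of key in the dictionary dict (an environment);
 +# a key not seen before gets the next free index
 +getInd <- function(key,dict){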
 +  if(!exists(key,envir=dict,inherits=FALSE)) {
 +    assign(key,length(dict)+1,envir=dict) 
 +  } 
 +  return(get(key,envir=dict,inherits=FALSE))
 +}
 +
 +cat('*** Transform:',date(),'\n'); flush.console()
 +setwd("C:/Users/batagelj/work/R/gisela")
 +ins <- read.csv("Institutions.csv",sep=";",colClasses="character")
 +keys <- read.csv("KeyWordsPlus.csv",sep=";",colClasses="character")
 +# D indexes the Documents of both tables, so both networks get the same Documents set
 +D <- new.env(hash=TRUE,parent=emptyenv())
 +r <- unlist(lapply(ins$Document,function(x){getInd(x,D)}))
 +s <- unlist(lapply(keys$Document,function(x){getInd(x,D)}))
 +I <- new.env(hash=TRUE,parent=emptyenv()) 
 +K <- new.env(hash=TRUE,parent=emptyenv())
 +p <- unlist(lapply(ins$Institution,function(x){getInd(x,I)}))
 +q <- unlist(lapply(keys$KW.,function(x){getInd(x,K)}))
 +lD <- length(D)
 +WI <- file("WI.net","w")
 +# Pajek two-mode header: total number of vertices, then the size of the first mode (Documents)
 +cat('*vertices',lD+length(I),lD,'\n',sep=' ',file=WI)
 +for(e in ls(D)) cat(getInd(e,D),' "',e,'"\n',sep='',file=WI)
 +for(e in ls(I)) cat(lD+getInd(e,I),' "',e,'"\n',sep='',file=WI)
 +cat('*arcs\n',file=WI)
 +for(i in seq_along(r)) cat(r[i],lD+p[i],'\n',file=WI)
 +close(WI) 
 +WK <- file("WK.net","w")
 +cat('*vertices',lD+length(K),lD,'\n',sep=' ',file=WK)
 +for(e in ls(D)) cat(getInd(e,D),' "',e,'"\n',sep='',file=WK)
 +for(e in ls(K)) cat(lD+getInd(e,K),' "',e,'"\n',sep='',file=WK)
 +cat('*arcs\n',file=WK)
 +for(i in seq_along(s)) cat(s[i],lD+q[i],'\n',file=WK)
 +close(WK)
 +cat('*** Finished:',date(),'\n\n') 
 +</code>
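 +
 +To see what the resulting files look like, the following sketch builds the lines of a small two-mode network in memory. The helper ''twoModeLines'' and the toy labels are hypothetical and not part of the script above; unlike the script, the sketch lists the vertices in index order.
 +
 +<code>
 +# hypothetical helper: lines of a Pajek two-mode network
 +# rowLab, colLab - labels of the first and second mode
 +# r, p           - row and column index of each link
 +twoModeLines <- function(rowLab, colLab, r, p) {
 +  n1 <- length(rowLab)
 +  c(paste("*vertices", n1 + length(colLab), n1),
 +    paste0(seq_along(rowLab), ' "', rowLab, '"'),
 +    paste0(n1 + seq_along(colLab), ' "', colLab, '"'),
 +    "*arcs",
 +    paste(r, n1 + p))
 +}
 +net <- twoModeLines(c("1","2","3"), c("A","B"), r=c(1,1,2), p=c(1,2,1))
 +writeLines(net)
 +</code>
 +
 +The first line is ''*vertices 5 3''. Document 3 has no links in this network but stays in the vertex list, which is what keeps it compatible with a keywords network built over the same documents.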
 +
 +{{notes:zip:gisela.zip}}
  
notes/net/comp.txt · Last modified: 2015/07/16 22:32 by vlado
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported