====== R functions for Pajek's networks ======
[[notes:clu:cluster|Structure of clustering data in R]]
===== Reading Pajek's net file =====
July 24, 2017
See also https://github.com/bavla/Rnet/tree/master/R
==== Using Pajek ====
The simplest way to transfer a Pajek network into R is to use Pajek's command
Tools/R/Send to R/Current Network
In R we can then also use the following commands:
Use objects() to get list of available objects
Use comment(?) to get information about selected object
Use savevector(v?,'???.vec') to save vector to Pajek input file
Use savematrix(n?,'???.net') to save matrix to Pajek input file (.MAT)
savematrix(n?,'???.net',2) to request a 2-mode matrix (.MAT)
Use savenetwork(n?,'???.net') to save matrix to Pajek input file (.NET)
savenetwork(n?,'???.net',2) to request a 2-mode network (.NET)
Use v?<-loadvector('???.vec') to load vector(s) from Pajek input file
Use n?<-loadmatrix('???.mat') to load matrix from Pajek input file
This approach runs into problems if the node labels contain Unicode characters, because Pajek represents them as XML entities (numeric character references). For example, the label ''МАКАГОНОВА Н'' is represented as ''%%&#1052;&#1040;&#1050;&#1040;&#1043;&#1054;&#1053;&#1054;&#1042;&#1040; &#1053;%%''. To convert such labels into Unicode in R we can use the package ''xml2'':
> library(xml2)
> xml2utf8 <- function(str){
+ t <- xml2::xml_text(xml2::read_xml(paste0("<x>", str, "</x>")))
+ Encoding(t) <- "UTF-8"
+ return (t)
+ }
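The same conversion can be sketched in base R, without ''xml2'', under the assumption that the labels contain only decimal numeric character references of the form ''%%&#1052;%%''. The helper name ''ncr2utf8'' is made up for this illustration; it is not part of Pajek or ''xml2'', and it does not handle named entities or hexadecimal references.

```r
# Hypothetical helper (assumption: only decimal references like "&#1052;" occur).
ncr2utf8 <- function(str) {
  m <- gregexpr("&#[0-9]+;", str)                 # positions of all references
  regmatches(str, m) <- lapply(regmatches(str, m), function(refs) {
    codes <- as.integer(gsub("[^0-9]", "", refs)) # keep only the code points
    vapply(codes, intToUtf8, character(1))        # one character per reference
  })
  str
}

# Usage: the first three letters of the label above, followed by plain text
lab <- ncr2utf8("&#1052;&#1040;&#1050; N")
plain <- ncr2utf8("no entities here")
```

Since the function is vectorized over ''str'', it could be applied to ''rownames(n2)'' directly, without ''lapply''.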
Now we can "correct" the network description (suppose that the network is stored in n2 with values from {0,1,2,3})
> N <- unlist(lapply(rownames(n2),xml2utf8))
> rownames(n2) <- N; colnames(n2) <- N
and, for example, visualize the network:
> col<- colorRampPalette(c("white", "black"))(4)
> heatmap(x=n2,col=col,symm=TRUE,cexRow=0.3,cexCol=0.3)
{{notes:pics:pajheat.png}}
or, to get it as vector graphics (SVG):
> svg("PAJheat.svg",width=10,height=10)
> heatmap(x=n2,col=col,symm=TRUE,cexRow=0.5,cexCol=0.5)
> dev.off()
{{notes:pdf:pajheat.pdf}}
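To try the plotting calls without Pajek, the following sketch builds a small synthetic stand-in for ''n2''. The matrix, its labels and the output file are invented for illustration. It writes to a PDF device only because ''pdf()'' is available in every R build, while ''svg()'' needs cairo support.

```r
# Synthetic stand-in for n2: symmetric 5 x 5 matrix with values in {0,1,2,3}
set.seed(1)
n2 <- matrix(0, 5, 5, dimnames = list(LETTERS[1:5], LETTERS[1:5]))
n2[upper.tri(n2)] <- sample(0:3, 10, replace = TRUE)
n2 <- n2 + t(n2)                                  # mirror the upper triangle

# Four grey levels from white to black, one per value 0..3
col <- colorRampPalette(c("white", "black"))(4)

f <- tempfile(fileext = ".pdf")
pdf(f, width = 7, height = 7)
heatmap(x = n2, col = col, symm = TRUE)
invisible(dev.off())
```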
For details on matrix layouts see [[notes:net:rmat]].
==== Directly reading NET file into R ====
[[https://raw.githubusercontent.com/bavla/Rnet/master/data/BM.net|Bavla/Rnet/Data/BM.net]]
To read a NET file directly we can use the following code:
> # read the file and locate the *Vertices line; n is the number of nodes
> S <- readLines("BM.net")
> i <- grep("\\*vert",S,ignore.case=TRUE)
> n <- as.integer(unlist(strsplit(S[i],"[[:space:]]+"))[2])
> # split each node line at the quotes: index, label, rest (coordinates)
> SV <- S[(i+1):(i+n)]; Encoding(SV) <- "UTF-8"
> L <- strsplit(SV,'\"')
> df <- data.frame(matrix(unlist(L),nrow=n,byrow=TRUE),stringsAsFactors=FALSE)
> ind <- as.integer(df$X1)
> N <- df$X2; Encoding(N) <- "UTF-8"
> trim <- function (x) gsub("^\\s+|\\s+$", "", x)
> xyz <- trim(df$X3)
> V <- data.frame(ind=ind,name=N,xyz=xyz)
> # *Edges means undirected links; anything else is treated as *Arcs
> edges <- toupper(substr(S[i+n+1],1,2)) == "*E"
> m <- length(S)
> if(nchar(S[m])==0) m <- m-1
> # parse the link lines: from, to, weight
> L <- trim(S[(i+n+2):m]); m <- length(L)
> E <- strsplit(L,'[[:space:]]+')
> df <- data.frame(matrix(unlist(E),nrow=m,byrow=TRUE),stringsAsFactors=FALSE)
> df$X1 <- as.integer(df$X1); df$X2 <- as.integer(df$X2)
> df$X3 <- as.numeric(df$X3)
> # accumulate the weights in the n x n matrix W (parallel links are summed)
> W <- matrix(0,nrow=n,ncol=n)
> for(e in seq_len(m)){
+ W[df$X1[e],df$X2[e]] <- W[df$X1[e],df$X2[e]] + df$X3[e]
+ if(edges) W[df$X2[e],df$X1[e]] <- W[df$X2[e],df$X1[e]] + df$X3[e]
+ }
> rownames(W) <- N; colnames(W) <- N
> # dump() writes W as R source text; it can be restored with source("W.Rdata")
> dump("W","W.Rdata")
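The steps above can be collected into a single function. The following is a minimal sketch, not part of Rnet: the name ''net2matrix'' and the tiny test network are made up for illustration. It assumes a one-mode network with quoted node labels, a single ''*Arcs'' or ''*Edges'' section, and links given as "from to weight". Unlike the transcript above, it skips all empty lines and does not count a loop twice in the undirected case.

```r
# Hypothetical wrapper (not part of Rnet); see the assumptions in the text.
net2matrix <- function(file) {
  S <- readLines(file, encoding = "UTF-8")
  S <- S[nzchar(trimws(S))]                       # drop empty lines
  i <- grep("^\\*vert", S, ignore.case = TRUE)[1]
  n <- as.integer(strsplit(trimws(S[i]), "[[:space:]]+")[[1]][2])
  # the label is the text between the first pair of double quotes
  N <- sub('^[^"]*"([^"]*)".*$', "\\1", S[(i + 1):(i + n)])
  edges <- toupper(substr(S[i + n + 1], 1, 2)) == "*E"
  W <- matrix(0, nrow = n, ncol = n, dimnames = list(N, N))
  if (length(S) < i + n + 2) return(W)            # no link lines
  L <- strsplit(trimws(S[(i + n + 2):length(S)]), "[[:space:]]+")
  for (l in L) {
    u <- as.integer(l[1]); v <- as.integer(l[2]); w <- as.numeric(l[3])
    W[u, v] <- W[u, v] + w                        # parallel links are summed
    if (edges && u != v) W[v, u] <- W[v, u] + w   # mirror undirected links
  }
  W
}

# Usage on a small invented network with a repeated arc 1 -> 2
tmp <- tempfile(fileext = ".net")
writeLines(c('*Vertices 3',
             '1 "a" 0.1 0.2 0.5',
             '2 "b" 0.3 0.4 0.5',
             '3 "c" 0.5 0.6 0.5',
             '*Arcs',
             '1 2 1',
             '2 3 2',
             '1 2 3'), tmp)
W <- net2matrix(tmp)
print(W)
```

Extracting the label with a regular expression instead of splitting at the quotes keeps the sketch working when a node line has no coordinates after the label.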
===== Reading clustering =====
[[notes:net:jocr#read_clustering_in_r]]