====== R functions for Pajek's networks ======

[[notes:clu:cluster|Structure of clustering data in R]]

===== Reading Pajek's net file =====

July 24, 2017

See also https://github.com/bavla/Rnet/tree/master/R

==== Using Pajek ====

The simplest way to transfer a Pajek network into R is to use Pajek's command

  Tools/R/Send to R/Current Network

In R we can then also use the following commands:

  Use objects() to get list of available objects
  Use comment(?) to get information about selected object
  Use savevector(v?,'???.vec') to save vector to Pajek input file
  Use savematrix(n?,'???.net') to save matrix to Pajek input file (.MAT)
      savematrix(n?,'???.net',2) to request a 2-mode matrix (.MAT)
  Use savenetwork(n?,'???.net') to save matrix to Pajek input file (.NET)
      savenetwork(n?,'???.net',2) to request a 2-mode network (.NET)
  Use v?<-loadvector('???.vec') to load vector(s) from Pajek input file
  Use n?<-loadmatrix('???.mat') to load matrix from Pajek input file

This approach runs into problems if the node labels contain Unicode characters, because Pajek represents them as XML entities. For example, the label ''МАКАГОНОВА Н'' is represented as ''%%&#1052;&#1040;&#1050;&#1040;&#1043;&#1054;&#1053;&#1054;&#1042;&#1040; &#1053;%%''. To convert such labels into Unicode in R we can use the library ''xml2'', wrapping each label in a dummy XML element so that it can be parsed:

  > library(xml2)
  > xml2utf8 <- function(str){
  +   t <- xml2::xml_text(xml2::read_xml(paste0("<x>", str, "</x>")))
  +   Encoding(t) <- "UTF-8"
  +   return (t)
  + }

Now we can "correct" the network description (suppose that the network is stored in ''n2'' with values from {0,1,2,3})

  > N <- unlist(lapply(rownames(n2),xml2utf8))
  > rownames(n2) <- N; colnames(n2) <- N

and, for example, visualize the network:

  > col <- colorRampPalette(c("white", "black"))(4)
  > heatmap(x=n2,col=col,symm=TRUE,cexRow=0.3,cexCol=0.3)

{{notes:pics:pajheat.png}}

or, to get it in vector graphics (SVG):

  > svg("PAJheat.svg",width=10,height=10)
  > heatmap(x=n2,col=col,symm=TRUE,cexRow=0.5,cexCol=0.5)
  > dev.off()

{{notes:pdf:pajheat.pdf}}

For details on matrix layouts see [[notes:net:rmat]].

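
The label conversion described above can also be sketched in base R, without ''xml2''. This is only a minimal sketch under one assumption: the labels contain nothing but decimal numeric character references of the form ''%%&#NNNN;%%''. Named entities such as ''%%&amp;%%'' are not handled. The helper name ''ent2utf8'' is hypothetical and is not part of Pajek or of the Rnet code.

<code r>
# Hypothetical helper: replace decimal references such as &#1052;
# by the corresponding Unicode characters (base R only).
ent2utf8 <- function(str){
  # positions of all decimal character references in each string
  m <- gregexpr("&#[0-9]+;", str)
  # convert every matched reference to its character and put it back
  regmatches(str, m) <- lapply(regmatches(str, m), function(e){
    code <- as.integer(gsub("[^0-9]", "", e))   # keep only the digits
    vapply(code, intToUtf8, character(1))       # code point -> character
  })
  str
}

# Example on a single entity-encoded label
lab <- ent2utf8("&#1052;&#1040;&#1050;&#1040;&#1043;&#1054;&#1053;&#1054;&#1042;&#1040; &#1053;")
print(lab)
</code>

Because the function is vectorized over its argument, the matrix labels could be converted with ''rownames(n2) <- colnames(n2) <- ent2utf8(rownames(n2))''.
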
==== Directly reading NET file into R ====

[[https://raw.githubusercontent.com/bavla/Rnet/master/data/BM.net|Bavla/Rnet/Data/BM.net]]

To read a NET file directly we can use the following code. It assumes a single ''*Vertices'' section with quoted labels, followed by one ''*Edges'' or ''*Arcs'' section in which every line has the form ''u v w''. Links from an ''*Edges'' section are stored symmetrically.

  > # read the file and locate the *Vertices line; n is the number of nodes
  > S <- readLines("BM.net")
  > i <- grep("\\*vert",S,ignore.case=TRUE)
  > n <- as.integer(unlist(strsplit(S[i]," "))[2])
  > # split each node line at the quotes: index, label, coordinates
  > SV <- S[(i+1):(i+n)]; Encoding(SV) <- "UTF-8"
  > L <- strsplit(SV,'\"')
  > df <- data.frame(matrix(unlist(L),nrow=n,byrow=TRUE),stringsAsFactors=FALSE)
  > ind <- as.integer(df$X1)
  > N <- df$X2; Encoding(N) <- "UTF-8"
  > trim <- function (x) gsub("^\\s+|\\s+$", "", x)
  > xyz <- trim(df$X3)
  > V <- data.frame(ind=ind,name=N,xyz=xyz)
  > # the line after the nodes says whether the links are edges or arcs
  > edges <- toupper(substr(S[i+n+1],1,2)) == "*E"
  > m <- length(S)
  > if(nchar(S[m])==0) m <- m-1
  > # split the link lines into initial node, terminal node and weight
  > L <- trim(S[(i+n+2):m]); m <- length(L)
  > T <- strsplit(L,'[[:space:]]+')
  > df <- data.frame(matrix(unlist(T),nrow=m,byrow=TRUE),stringsAsFactors=FALSE)
  > df$X1 <- as.integer(df$X1); df$X2 <- as.integer(df$X2)
  > df$X3 <- as.numeric(df$X3)
  > # accumulate the weights in the n x n matrix W
  > W <- matrix(0,nrow=n,ncol=n)
  > for(e in 1:m){
  +   W[df$X1[e],df$X2[e]] <- W[df$X1[e],df$X2[e]] + df$X3[e]
  +   if(edges) W[df$X2[e],df$X1[e]] <- W[df$X2[e],df$X1[e]] + df$X3[e]
  + }
  > rownames(W) <- N; colnames(W) <- N
  > dump("W","W.Rdata")

===== Reading clustering =====

[[notes:net:jocr#read_clustering_in_r]]
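
For quick reference, here is a minimal sketch of reading a Pajek partition (''.clu'') file into an integer vector. It assumes the usual layout: a ''*Vertices n'' line followed by n lines, each holding one cluster number. The helper name ''readClu'' is hypothetical, and the sketch is not necessarily the approach used on the linked page.

<code r>
# Hypothetical helper: read a Pajek partition (.clu) file
readClu <- function(file){
  S <- trimws(readLines(file, warn=FALSE))
  S <- S[nchar(S) > 0]                           # drop empty lines
  i <- grep("\\*vert", S, ignore.case=TRUE)[1]   # the *Vertices n line
  n <- as.integer(strsplit(S[i], "[[:space:]]+")[[1]][2])
  as.integer(S[(i+1):(i+n)])                     # one cluster number per node
}

# Self-contained demonstration on a temporary file
f <- tempfile(fileext=".clu")
writeLines(c("*Vertices 4", "1", "2", "2", "1"), f)
clu <- readClu(f)
print(table(clu))                                # cluster sizes
</code>

The result is ordered by node index, so it can be combined with node labels read from the matching NET file, for example ''names(clu) <- N''.
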