




Combining JCR networks into a single network for each year

Combine networks from https://www.leydesdorff.net/jcr07/cited/ into a single network for each year.

At first sight the files seem to be numbered consecutively, but this assumption turned out not to hold.

<b>Cited 2007:</b><br>
From the HTML source of the page https://www.leydesdorff.net/jcr07/cited/ I cut out the list of all files and saved it in the file list07.txt. I then extracted the corresponding file names from it in R.

> wdir <- "C:/Users/vlado/work2/data/nets/loet"
> setwd(wdir)
> source("Pajek.R")
> # https://www.leydesdorff.net/jcr07/cited/v4.txt
> cat("Start:",format(Sys.time(), "%H:%M:%S"),"\n") 
> durl <- "https://www.leydesdorff.net/jcr07/cited/"
> L <- readLines("list07.txt"); L <- L[tolower(substr(L,1,2))=="<a"]
> S <- unlist(strsplit(L,'"')); F <- tolower(S[3*(1:length(L))-1])
> m <- 0; mmax <- 800000; ner <- 0
> U <- rep(NA,mmax); V <- rep(NA,mmax); W <- rep(NA,mmax)
> N <- c(); N["§{@@@@@@@@}§"] <- 0; j <- 0
> for(f in F){
+   j <- j+1
+   if(j %% 100==0) {cat(j,":",format(Sys.time(), "%H:%M:%S"),"\n"); flush.console()}
+   page <- paste(durl,f,sep='')
+   M <- net2matrix(page,warn=-1)
+   if(length(M)==1 && is.na(M)){ner <- ner+1  # a single NA signals a failed read
+     cat("\n",j,"error in file",page,"\n"); flush.console(); next
+   }
+   nn <- nrow(M); Nam <- row.names(M)
+   for(u in 1:nn) for(v in u:nn){
+     if(M[u,v]!=0){
+       indu <- N[Nam[u]]; if(is.na(indu)) indu <- N[Nam[u]] <- length(N)
+       indv <- N[Nam[v]]; if(is.na(indv)) indv <- N[Nam[v]] <- length(N)
+       m <- m+1; U[m] <- indu; V[m] <- indv; W[m] <- M[u,v]
+     }
+   }
+ }
> cat("end:",format(Sys.time(), "%H:%M:%S"),"\n",ner," errors\n")
> uvLab2net(names(N)[2:length(N)],U[1:m],V[1:m],W[1:m],Net="JCR07.net")

The program was interrupted with the error:

Error in file(file, "r") : cannot open the connection

To resume the network construction at the point of interruption, I entered a slightly modified part of the program (it also works for subsequent interruptions):

> cat("Start:",format(Sys.time(), "%H:%M:%S"),"\n")
> F1 <- F[j:length(F)]; j <- j-1
> for(f in F1){
+   j <- j+1
+   # ... the remaining loop body is unchanged from the first run ...
+ }
> cat("end:",format(Sys.time(), "%H:%M:%S"),"\n",ner," errors\n")
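The manual restart above could also be avoided by retrying a failed read inside the loop. The following is only a minimal sketch of that idea, not the procedure actually used here: `read_with_retry` and the stand-in reader `flaky` are hypothetical names introduced for illustration, and the sketch assumes that a failed read raises an R error, as the `cannot open the connection` message suggests.

```r
# Hypothetical helper (not part of Pajek.R): call reader(page) up to `tries`
# times and return NULL instead of stopping when every attempt fails.
read_with_retry <- function(reader, page, tries = 3, wait = 0) {
  for (k in seq_len(tries)) {
    res <- tryCatch(reader(page), error = function(e) NULL)
    if (!is.null(res)) return(res)
    if (k < tries) Sys.sleep(wait)   # pause before the next attempt
  }
  NULL
}

# Stand-in reader that fails twice and then succeeds, to exercise the helper.
calls <- 0
flaky <- function(page) {
  calls <<- calls + 1
  if (calls < 3) stop("cannot open the connection")
  matrix(1, 2, 2)
}
M <- read_with_retry(flaky, "v4.txt")
```

In the loop, the call could then read `M <- read_with_retry(function(p) net2matrix(p, warn=-1), page)`, with a `NULL` result counted as an error and the file skipped.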

The resulting network JCR*.net contains many multiple (parallel) links with different weights. In Pajek I transformed it into the corresponding simple network JCR*s.net, keeping the maximum weight on each link.
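The same reduction can be illustrated in R on a toy link list. This is a sketch only and not the Pajek operation itself: the data frame `links` is made-up example data, and the sketch assumes that a link is identified by the ordered pair (U, V), which may differ from how Pajek treats the direction of links.

```r
# Toy link list with parallel links; U, V, W mirror the vectors built in the loop.
links <- data.frame(U = c(1, 1, 2, 1),
                    V = c(2, 2, 3, 2),
                    W = c(5, 9, 4, 7))

# Keep one link per (U, V) pair, carrying the largest weight found for it.
simple <- aggregate(W ~ U + V, data = links, FUN = max)
```

Here the three parallel links between nodes 1 and 2 collapse into a single link, so `simple` has one row per distinct pair.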

Network    Nodes    LinksC     LinksS     AvDegree
JCR4        7251    463056     109036       30.075
JCR5        7397    549421     120835       32.671
JCR6        7487    543443     119713       31.979
JCR7        7769    552335     124208       31.975

LinksC is the number of links in the network JCR*.net and LinksS is the number of links in the network JCR*s.net. AvDegree is the average degree in the network JCR*s.net.
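The AvDegree column is consistent with the usual formula 2 * LinksS / Nodes, where each link contributes to the degree of both its end nodes. A short check in R, using only the values from the table (the variable names are chosen here for illustration):

```r
# Values copied from the table above.
nodes  <- c(JCR4 = 7251,   JCR5 = 7397,   JCR6 = 7487,   JCR7 = 7769)
linksS <- c(JCR4 = 109036, JCR5 = 120835, JCR6 = 119713, JCR7 = 124208)

# Average degree: every link is counted once for each of its two end nodes.
avdeg <- 2 * linksS / nodes
print(round(avdeg, 3))
```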

The Pajek NET files are available at GitHub/Bavla.