====== LPP - Ljubljanski potniški promet ====== ===== Structure of bus line web description ===== Looking at the source of the [[https://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?stop=0&l=N1|city bus line]] description ...

Linija N1, smer: BROD


Čas vožnje v minutah
Postajališče
we see that the data can be extracted using the XML library > wdir <- "C:/Users/vlado/DL/data/LJ/LPP" > setwd(wdir) > library(XML) > cat("LPP:",date(),"\n\n") > page <- 'https://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?stop=0&l=N1' > html <- readLines(con<-url(page)); close(con) > S <- html[nchar(html)>0] > Page <- htmlParse(S) > length(S) > post <- xpathSApply(Page,'//a[@class="stop"]',xmlAttrs)[1,] > lab <- trimws(xpathSApply(Page,'//a[@class="stop"]',xmlValue)) > T <- matrix(unlist(strsplit(post,"&")),ncol=2,byrow=TRUE) > L <- data.frame(lab=lab,stop=gsub("\\?stop=", "",T[,1]),ref=gsub("ref=", "",T[,2])) > head(L) lab stop ref 1 BAVARSKI DVOR (Kozolec) 600012-1 1387 2 Gosposvetska 700012-1 1387 3 Tivoli 801012-1 1387 4 Stara cerkev 802012-1 1387 5 Kino Šiška 802022-1 1387 6 Remiza 803014-1 1387 > From https://www.lpp.si/javni-prevoz/vozni-redi we get a list of bus lines (the data about the line 19B were not available) and after some editing # https://www.lpp.si/javni-prevoz/vozni-redi proga;naslov;ime;opomba 1;"https://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?stop=0&l=1";Stanežiče P+R - Dolgi most P+R;ob delavnikih in sobotah 1B;"https://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?stop=0&l=1B";Brod - Stanežiče P+R - Dolgi most P+R;ob nedeljah in praznikih N1;"https://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?stop=0&l=N1";Bavarski dvor - Stanežiče P+R - Brod;nočna - vse dni v letu 2;"https://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?stop=0&l=2";Zelena jama - Nove Jarše; 3;"https://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?stop=0&l=3";Litostroj - Rudnik; ... 19;"http://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?line=1382";Tomačevo - Barje; ... 60;"https://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?stop=0&l=60";Vodice - Ljubljana;ob delavnikih in sobotah 61;"https://www.lpp.si/sites/default/files/lpp_vozniredi/iskalnik/index.php?stop=0&l=61";Vodice-Polje-Vodice;ob delavnikih in sobotah saved to the file ''lines.csv''. ===== Extracting data about bus lines ===== First, we extract the interesting data from bus line web pages and collect it in the data frame LL. > cat("LPP:",date(),"\n\n") > line <- read.csv2("lines.csv",head=TRUE,skip=1) > line$proga <- trimws(line$proga) > LL <- NULL; line$koda <- NA > for(i in 1:nrow(line)){ + lineLab <- line$proga[i] + cat(i,"-",lineLab,":",line$ime[i],"\n"); flush.console() + page <- line$naslov[i] + html <- readLines(con<-url(page)); close(con) + S <- html[nchar(html)>0]; cat("length =",length(S)) + save <- file(paste("./DL/",lineLab,".html",sep="")) + writeLines(S,save); close(save) + Page <- htmlParse(S) + post <- xpathSApply(Page,'//a[@class="stop"]',xmlAttrs)[1,] + lab <- trimws(xpathSApply(Page,'//a[@class="stop"]',xmlValue)) + T <- matrix(unlist(strsplit(post,"&")),ncol=2,byrow=TRUE) + L <- data.frame(lab=lab,stop=gsub("\\?stop=", "",T[,1]),ref=gsub("ref=", "",T[,2])) + LL <- rbind(LL,L) + line$koda[i] <- L$ref[1]; cat(" koda =",L$ref[1],"\n") + } > write.csv2(line,"linesNew.csv") > write.csv2(LL,"LPP23.csv") > dim(LL) [1] 2269 3 > head(LL) lab stop ref 1 D. MOST P+R 704011-1 1373 2 Zgornji log 704141-1 1373 3 Cesta v Gorice 704151-1 1373 4 Mestni log 604031-1 1373 5 Tbilisijska 603141-1 1373 6 Koprska 603041-1 1373 > tail(LL) lab stop ref 2264 Koseze 826111-2 817 2265 Utik 826041-2 817 2266 Bukovica Anzel 826031-2 817 2267 Bukovica Plevevc 826021-2 817 2268 Vodice OŠ 826091-2 817 2269 VODICE 826011-2 817 LPP: Mon Sep 18 04:06:22 2023 1 - 1 : Stanežiče P+R - Dolgi most P+R length = 1439 koda = 1373 2 - 1B : Brod - Stanežiče P+R - Dolgi most P+R length = 1505 koda = 1362 3 - N1 : Bavarski dvor - Stanežiče P+R - Brod length = 1331 koda = 1387 4 - 2 : Zelena jama - Nove Jarše length = 1439 koda = 1203 5 - 3 : Litostroj - Rudnik length = 1349 koda = 1321 6 - 3B : Litostroj - Škofljica length = 1421 koda = 1374 7 - 3G : Bežigrad (Železna) - Grosuplje length = 1571 koda = 1303 8 - N3 : Bavarski dvor - Rudnik length = 1271 koda = 1223 9 - N3B : Bavarski dvor - Škofljica length = 1343 koda = 1224 10 - 5 : Podutik - Štepanjsko naselje length = 1391 koda = 1265 11 - N5 : Podutik - Bavarski dvor - Štepanjsko naselje length = 1391 koda = 1367 12 - 6 : Črnuče - Dolgi most (P+R) length = 1409 koda = 1324 13 - 6B : Bežigrad (Železna) - Notranje Gorice - Jezero length = 1409 koda = 1325 14 - 7 : Pržan - Nove Jarše length = 1385 koda = 1326 15 - 7L : Pržan - Letališka length = 1427 koda = 1327 16 - 8 : Brnčičeva - Brod length = 1511 koda = 1375 17 - 9 : Barje P+R - Štepanjsko naselje length = 1331 koda = 1269 18 - 10 : Zadobrova - Podgrad length = 1319 koda = 1376 19 - 11 : Ježica P+R - Zalog length = 1553 koda = 1330 20 - 11B : Bežigrad (Železna) - Zalog length = 1451 koda = 1331 21 - 12 : Bežigrad (Železna) - Vevče length = 1325 koda = 1377 22 - 12D : Bežigrad (Železna) - Dragomelj length = 1289 koda = 1381 23 - 13 : Center Stožice P+R - Sostro length = 1409 koda = 1274 24 - 14 : Savlje - Bokalce length = 1451 koda = 1335 25 - 15 : Stanežiče - Medvode - Sora length = 1409 koda = 1336 26 - 16 : Črni log - Trnovo length = 1253 koda = 1302 27 - 18 : Kolodvor - ZOO - Center Stožice length = 1421 koda = 1337 28 - 18L : Kolodvor - ZOO - Litostrojska length = 1235 koda = 1338 29 - 19 : Tomačevo - Barje length = 1391 koda = 1382 30 - 19I : Tomačevo - Iška vas obračališče length = 1511 koda = 1383 31 - 19Z : Jezero - Notranje Gorice length = 1133 koda = 1257 32 - 20 : Nove Stožice - Fužine length = 1349 koda = 1341 33 - 20Z : Nove Stožice Zalog - preko Fužin length = 1481 koda = 1342 34 - 21 : Beričevo - Ježica length = 1277 koda = 1384 35 - 21Z : Gameljne - Vižmarje length = 1151 koda = 1385 36 - 22 : Kamna Gorica - Fužine length = 1373 koda = 1344 37 - 23 : Podutik - Kamna Gorica length = 1193 koda = 976 38 - 24 : BTC Atlantis - Vevče length = 1337 koda = 1363 39 - 25 : Medvode - Zadobrova length = 1523 koda = 1345 40 - 26 : Mali Lipoglav - Tuji grm length = 1541 koda = 846 41 - 27 : Letališka - BTC - NS Rudnik length = 1457 koda = 1386 42 - 30 : Medvode - Vodice length = 1247 koda = 1139 43 - 51 : Ljubljana-Dobrova-Polhov Gradec length = 1403 koda = 963 44 - 52 : Polhov Gradec-Črni vrh length = 1211 koda = 646 45 - 53 : Polhov Gradec-Suhi dol length = 1199 koda = 844 46 - 56 : Ljubljana-Dobrova-Vrzdenec-Šentjošt length = 1439 koda = 648 47 - 60 : Vodice - Ljubljana length = 1439 koda = 704 48 - 61 : Vodice-Polje-Vodice length = 1289 koda = 817 ===== Constructing the LPP23 network ===== From the data frame LL we construct a multi-relational network LPP23. Each bus line determines a relation. > K <- factor(line$koda,levels=line$koda) > lineLab[as.integer(K)] <- line$proga > LL$rel <- as.integer(factor(LL$ref,levels=levels(K))) > LL$node <- as.integer(factor(LL$stop)) > LL$last <- as.integer(substr(LL$stop,nchar(LL$stop),nchar(LL$stop))) > # LL$main <- as.integer(substr(LL$stop,1,nchar(LL$stop)-2)) > # LL$p <- as.integer(factor(LL$main)) > LL$lab[1280] <- LL$lab[1285] <- "Rakovnik S" > LL$lab[2199] <- LL$lab[2216] <- LL$lab[2242] <- LL$lab[2264] <- "Koseze V" > > n <- max(LL$node) > nodeNam <- nodeCode <- nodeClu <- rep(NA,n) > clu <- file("LPP23.clu","w"); cat("*vertices",n,"\n",file=clu) > net <- file("LPP23.net","w"); nam <- file("LPP23.nam","w") > cat("*vertices",n,"\n",file=net); cat("*vertices",n,"\n",file=nam) > for(i in 1:nrow(LL)){ + v <- LL$node[i] + if(is.na(nodeNam[v])){nodeNam[v] <- LL$lab[i]; nodeCode[v] <- LL$stop[i]} + } > p <- as.integer(factor(nodeNam)) > for(v in 1:n){ + cat(v,' "',nodeCode[v],'"\n',sep="",file=nam) + cat(v,' "',nodeNam[v],'"\n',sep="",file=net) + cat(p[v],'\n',sep="",file=clu) + } > r <- 0 > for(i in 1:nrow(LL)){ + if(LL$rel[i] != r){ + if(r>0) cat(v,s,'\n',file=net) + r <- LL$rel[i]; rLab <- lineLab[r]; s <- v <- LL$node[i] + cat("*arcs :",r,' "',rLab,'"\n',sep="",file=net) + } else { + u <- v; v <- LL$node[i]; cat(u,v,'\n',file=net) + } + } > cat(v,s,'\n',file=net) > close(net); close(nam); close(clu) > > max(p) [1] 531 In Pajek I simplified the network LPP23.net (shrinking of nodes according to the partition LP23.clu and replacing pairs of opposite arcs with edges): Operations/Network+Partition/Shrinking [OK] Network/Create new network/Transform/Arcs -> Edges/Bidirected only/Min I saved the simplified network to the file LPP23edge.net. I manually improved the layout of the network and saved it to LPP23pic.net. I exported the picture in SVG format. Since I found the loops disturbing I created a partition counting the number of loops in a node (0-cyan, 1-yellow, 2-green, 3-red) and removed loops from the visualized network. Network/Create vector/Get loops Vector/Make partition/Truncating Network/Create new network/Transform/Remove/Loops Draw/Network+First partition Export/2D/SVG/Multiple relations network [[http://vlado.fmf.uni-lj.si/Pub/networks/vis/LPP23m.svg|SVG picture of LPP23]].\\ [[https://www.lpp.si/sites/www.jhl.si/files/dokumenti/shema_dnevnih_linij_lpp_september_2023.pdf|Map of daily bus lines September_2023]]. ===== Data cleaning ===== It turned out that there exist different stops with the same name (Rakovnik, Koseze). I renamed selected stops and rebuilt the network. > ir <- which(LL$lab=="Rakovnik") > LL[ir,] lab stop ref rel relLab node last main p 241 Rakovnik 503022-1 1321 5 3 594 1 503022 425 253 Rakovnik 503021-2 1321 5 3 593 2 503021 424 284 Rakovnik 503022-1 1374 6 3B 594 1 503022 425 308 Rakovnik 503021-2 1374 6 3B 593 2 503021 424 334 Rakovnik 503022-1 1303 7 3G 594 1 503022 425 394 Rakovnik 503021-2 1303 7 3G 593 2 503021 424 412 Rakovnik 503022-1 1223 8 N3 594 1 503022 425 424 Rakovnik 503021-2 1223 8 N3 593 2 503021 424 442 Rakovnik 503022-1 1224 9 N3B 594 1 503022 425 466 Rakovnik 503021-2 1224 9 N3B 593 2 503021 424 1280 Rakovnik 815212-1 1336 25 15 1204 1 815212 920 1285 Rakovnik 815211-2 1336 25 15 1203 2 815211 919 > LL$lab[1280] <- LL$lab[1285] <- "Rakovnik S" > ir <- which(LL$lab=="Koseze") > LL[ir,] lab stop ref rel relLab node last main p 481 Koseze 803141-1 1265 10 5 1059 1 803141 806 521 Koseze 803142-2 1265 10 5 1062 2 803142 807 531 Koseze 803141-1 1367 11 N5 1059 1 803141 806 571 Koseze 803142-2 1367 11 N5 1062 2 803142 807 1712 Koseze 803142-1 1344 36 22 1061 1 803142 807 1719 Koseze 803141-2 1344 36 22 1060 2 803141 806 2199 Koseze 826112-1 704 47 60 1253 1 826112 964 2216 Koseze 826111-2 704 47 60 1252 2 826111 963 2242 Koseze 826112-1 817 48 61 1253 1 826112 964 2264 Koseze 826111-2 817 48 61 1252 2 826111 963 > LL$lab[2199] <- LL$lab[2216] <- LL$lab[2242] <- LL$lab[2264] <- "Koseze V" Maybe some stops should be merged (Polje obračališče / POLJE obr.)?