R functions for reading and writing Pajek files without calling another networks package


Many users who perform network analysis in Pajek simply want to write input files and read output files without calling another networks package. Although several networks packages support reading Pajek files, fewer support writing them. And because these packages assume you want to manipulate the network further with them, they add several layers of package-specific meta-information when reading in a file. The result is slow processing when all you want is to move files in and out of R.


The script RPajekFunctions.R defines six functions: read_net and write_net for network files, read_clu and write_clu for partition files, and read_vec and write_vec for vector files. The functions are written in base R to avoid package dependency issues.

The logic behind this script is that the user loads the functions with a source statement (e.g., source("/Applications/Pajek64/RPajekFunctions.R")). Sourcing the script defines the functions and creates lists that enumerate all the .net, .clu, and .vec files in your working directory. The user can then simply refer to these lists and avoid writing directory path information.
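
As a rough illustration of the file lists described above, the following base R sketch enumerates Pajek files by extension. The helper name list_pajek_files and the list element names are assumptions made for this example and are not necessarily the names used in RPajekFunctions.R.

```r
# Hypothetical helper: the function name and the list element names are
# assumptions for illustration, not necessarily those in RPajekFunctions.R.
list_pajek_files <- function(dir = getwd()) {
  exts <- c(net = "net", clu = "clu", vec = "vec")
  # lapply keeps the names of exts, so the result is a named list
  lapply(exts, function(ext) {
    list.files(dir, pattern = paste0("\\.", ext, "$"), ignore.case = TRUE)
  })
}

# Demonstrate on a scratch directory so the real working directory is untouched.
demo_dir <- tempfile("pajek_demo_")
dir.create(demo_dir)
invisible(file.create(file.path(
  demo_dir, c("graph.net", "groups.clu", "scores.vec", "notes.txt")
)))

pajek_files <- list_pajek_files(demo_dir)
str(pajek_files)
```

Calling list_pajek_files() with no argument would list the files in the current working directory instead.
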
The read_net function supports reading Pajek ID, label, coordinate, color, and shape information. The write_net function supports writing Pajek .net files with coordinate and color information. The functions will still run if blank values are read into them. The read_clu and read_vec functions read all .clu and .vec files, respectively, in your working directory, sort them by size, and output data.frames, each consisting of partition or vector files of the same size. The file names are retained in the column names of the data.frames. The write_clu and write_vec functions write .clu and .vec files.
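
To make the grouping-by-size idea concrete, here is a minimal base R sketch for partition files. It assumes the usual Pajek .clu layout of a "*Vertices N" header line followed by one integer per line. The function names ending in _sketch and their signatures are hypothetical and do not reproduce the actual read_clu and write_clu implementations.

```r
# Hypothetical helpers; names and signatures are assumptions for illustration.

# Write a partition as a "*Vertices N" header plus one value per line.
write_clu_sketch <- function(partition, path) {
  writeLines(c(paste("*Vertices", length(partition)),
               as.character(partition)), path)
}

# Read one .clu file, dropping blank lines and the header line.
read_clu_sketch <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines)]
  as.integer(lines[-1])
}

# Read every .clu file in a directory and return one data.frame per size,
# with the file names kept as column names.
read_clu_dir_sketch <- function(dir) {
  files <- list.files(dir, pattern = "\\.clu$", full.names = TRUE)
  parts <- lapply(files, read_clu_sketch)
  names(parts) <- basename(files)
  by_size <- split(parts, lengths(parts))
  lapply(by_size, function(group) as.data.frame(group, optional = TRUE))
}

# Demonstration on a scratch directory.
clu_dir <- tempfile("clu_demo_")
dir.create(clu_dir)
write_clu_sketch(c(1L, 1L, 2L), file.path(clu_dir, "a.clu"))
write_clu_sketch(c(2L, 1L, 2L), file.path(clu_dir, "b.clu"))
write_clu_sketch(c(1L, 2L, 1L, 2L, 3L), file.path(clu_dir, "c.clu"))

clu_tables <- read_clu_dir_sketch(clu_dir)
str(clu_tables)
```

A .vec version of this sketch would follow the same pattern, converting the values with as.numeric instead of as.integer.
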

The script Pajek Functions_User Script_5May2021.R provides a series of use cases and examples, covering reading and writing both single and multiple files. These scripts were tested on PC and OS X machines on medium and large networks, although my tests have come nowhere near Pajek's full range. I expect that for networks at Pajek's top end, replacing base R's readLines function with a more efficient reader would be helpful, but I thought avoiding compatibility issues was worth the tradeoff.


Jonathan H. Morgan, Ph.D.
Fachhochschule Potsdam | University of Applied Sciences
IaF Urbane Zukunft | Institute for Urban Futures
Kiepenheuerallee 5, Haus 4, Raum 3.14
14469 Potsdam, Germany