1214th Wednesday seminar, 20 June 2012

Gomez Nuñez, Antonio Jesus antoniojesus.gomez@cchs.csic.es

Improving the Categorization of Scopus Journals included in SCImago Journal & Country Rank (SJR)

Scientific information stored in large multidisciplinary databases needs to be well organized and arranged, not only for information retrieval but also for developing reliable, non-misleading indicators of impact, collaboration, visibility and so on within disciplines like Bibliometrics and Scientometrics. Similarly, a good classification of information, regardless of aggregation level, is desirable for information visualization and network analysis, whose main studies draw on the information covered by scientific databases. Among the most prestigious of these databases are Web of Knowledge (WOK) [Thomson Reuters] and Scopus [Elsevier]. Both use a similar hierarchical classification scheme with two levels: areas (broad level) and categories (specific level).

Among the different tools for the analysis and assessment of scientific information, “SCImago Journal & Country Rank is a portal that includes the journals and country scientific indicators developed from the information contained in the Scopus® database (Elsevier B.V.). These indicators can be used to assess and analyze scientific domains” (Scimago Lab. http://www.scimagojr.com/). Starting from the original Scopus classification of journals, the SJR categorization was refined using criteria such as expert opinion and the titles and scopes of journals.

The aim of this work was to improve and fine-tune the categorization of the SJR journal set using automatic and statistical procedures, avoiding human intervention as far as possible. A first study therefore set out to improve the SJR classification scheme by starting from the initial categorization and using reference analysis, combined with different citation thresholds, to determine the final category of every journal. This method performed solidly when grouping journals at a level above categories, that is, when aggregating journals into subject areas. It also enabled us to redesign the SJR classification scheme into a more cohesive one that covers a good proportion of the re-categorized journals. However, to obtain a better categorization of journals, the method needs to be complemented with additional techniques.
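The threshold idea behind the reference analysis can be illustrated with a minimal Python sketch. It is not the procedure actually used for SJR: the function name `assign_category`, the single-category output and the default threshold of 0.5 are all assumptions made for illustration. The sketch only shows the general principle of letting a journal's references decide its category when one category is cited often enough.

```python
from collections import Counter


def assign_category(reference_categories, current_category, threshold=0.5):
    """Return a category for one journal (hypothetical helper).

    reference_categories: one entry per reference made by the journal,
        giving the category of the cited journal.
    current_category: the journal's initial category.
    threshold: assumed minimum share of references that a category
        must reach before it replaces the initial category.
    """
    counts = Counter(reference_categories)
    total = sum(counts.values())
    if total == 0:
        # No reference information: keep the initial categorization.
        return current_category
    top_category, top_count = counts.most_common(1)[0]
    if top_count / total >= threshold:
        return top_category
    # No category is cited often enough: keep the initial categorization.
    return current_category
```

Running the function with several threshold values on the same journal shows how stricter thresholds leave more journals in their initial category.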

In subsequent work, journals were clustered using a combination of three citation measures: Direct Citation (DC), Cocitation (CC) and Bibliographic Coupling (BC). Using the R statistical software, an asymmetrical journal-journal matrix holding the sum of the three fractionalized citation measures was constructed, and its values were transformed into cosine similarities. Finally, the similarity values were converted into distances and Ward hierarchical clustering was applied to them.
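The pipeline from combined matrix to clusters can be sketched as follows. The original work used R; this sketch uses Python with NumPy and SciPy instead. It assumes that the cosine similarity is computed between the rows of the combined matrix and that distance is taken as 1 minus similarity, neither of which the abstract specifies. The function name `cluster_journals` and the choice to cut the tree into a fixed number of clusters are also illustrative assumptions. Note that Ward's method is formally defined for Euclidean distances, so applying it to cosine-based distances is a heuristic.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_journals(combined, n_clusters):
    """Cluster journals from a square journal-by-journal matrix.

    combined: matrix of summed fractionalized DC + CC + BC values
        (may be asymmetrical); row i is the profile of journal i.
    n_clusters: assumed number of clusters at which to cut the tree.
    Returns an array with one cluster label per journal.
    """
    combined = np.asarray(combined, dtype=float)

    # Cosine similarity between row profiles.
    norms = np.linalg.norm(combined, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = combined / norms
    similarity = np.clip(unit @ unit.T, 0.0, 1.0)

    # Assumption: distance = 1 - cosine similarity.
    distance = 1.0 - similarity
    distance = (distance + distance.T) / 2.0  # remove rounding asymmetry
    np.fill_diagonal(distance, 0.0)

    # Ward hierarchical clustering on the condensed distance matrix.
    tree = linkage(squareform(distance, checks=False), method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Toy example with four journals forming two visible groups.
toy = [
    [5, 4, 0, 0],
    [3, 6, 1, 0],
    [0, 0, 7, 5],
    [0, 1, 4, 8],
]
labels = cluster_journals(toy, n_clusters=2)
```

In the toy matrix the first two journals share one citation profile and the last two share another, so they end up in separate clusters.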

The work proposed for Ljubljana concerns the use of the Pajek software and, specifically, island analysis to detect sub-networks (clusters) within the global journal citation network formed by around 19000 Scopus journals.

For future research, it would be interesting to employ new statistical or automatic techniques, or network analysis, that combine different variables such as citation measures, document text (title, abstract and/or keywords) or author addresses.
