TQ User guide

September 11th Reuters terror news

The Reuters terror news network was obtained from the CRA (Centering Resonance Analysis) networks produced by Steve Corman and Kevin Dooley at Arizona State University. The network is based on all the stories about the September 11 attack on the U.S. that the news agency Reuters released during 66 consecutive days, beginning at 9:00 AM EST on 9/11/01. The nodes of this network are important words (terms). There is an edge between two words if and only if they appear in the same utterance (for details see the paper \cite{CRA}). The weight of an edge is its frequency. The network has n = 13332 nodes (different words in the news) and m = 243447 edges, of which 50859 have a value larger than 1. There are no loops in the network.
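
The edge rule above can be illustrated with a minimal sketch. It is not the CRA procedure itself (see the cited paper for that): it assumes that each utterance is already given as a list of important words, it ignores the time dimension, and the function name and sample utterances are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(utterances):
    """Count, for each unordered pair of distinct words, the number of
    utterances in which both words appear (the edge weight)."""
    weights = Counter()
    for words in utterances:
        # set() drops repeated words inside one utterance, so no loops arise;
        # sorted() gives every pair a single canonical orientation.
        for u, v in combinations(sorted(set(words)), 2):
            weights[(u, v)] += 1
    return weights

# Illustrative utterances only, not taken from the Reuters data.
utterances = [
    ["attack", "plane", "tower"],
    ["attack", "plane"],
    ["taliban", "afghanistan"],
]
edges = cooccurrence_edges(utterances)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

An edge with weight larger than 1, such as (attack, plane) here, corresponds to a pair of words that share more than one utterance.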

The Reuters terror news network was used as a case network for the Viszards visualization session at the Sunbelt XXII International Social Network Conference, New Orleans, USA, 13-17 February 2002.

We transformed the Pajek version of the network into the Ianus format used in TQ. To identify important terms we computed their aggregated frequencies and extracted the subnetwork on the 50 terms used most frequently during the 66 days. They are listed in the following table:
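
The selection step can be sketched as follows. This is an illustration under stated assumptions, not the TQ code: a temporal quantity is assumed to be a list of (start, finish, value) triples with the value constant on the half-open interval [start, finish), and the aggregated value is assumed to be the sum of value times interval length. The names total and top_terms and the sample data are hypothetical.

```python
import heapq

def total(tq):
    """Aggregated value of a temporal quantity given as (start, finish, value) triples."""
    return sum((f - s) * v for s, f, v in tq)

def top_terms(freq, k):
    """Return the k terms with the largest aggregated frequency as (total, term) pairs."""
    return heapq.nlargest(k, ((total(tq), term) for term, tq in freq.items()))

# Illustrative temporal frequencies, not the real Reuters values.
freq = {
    "attack":  [(1, 2, 5), (2, 4, 3)],   # 5*1 + 3*2 = 11
    "anthrax": [(3, 4, 7)],              # 7*1 = 7
    "week":    [(1, 4, 1)],              # 1*3 = 3
}
top = top_terms(freq, 2)
print(top)
```

The subnetwork is then the part of the network induced by the selected terms.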

50 most frequent terms in the Terror news network.

 n   term             ∑freq      n   term       ∑freq
 1   united_states    15000     26   terrorism   2212
 2   attack           10348     27   day         2128
 3   taliban           6266     28   week        2017
 4   people            5286     29   worker      1983
 5   afghanistan       5176     30   office      1967
 6   bin_laden         4885     31   group       1966
 7   new_york          4832     32   air         1962
 8   pres_bush         4506     33   minister    1919
 9   washington        4047     34   time        1898
10   official          3902     35   hijack      1884
11   anthrax           3563     36   strike      1818
12   military          3394     37   afghan      1775
13   plane             3078     38   flight      1775
14   world_trade_ctr   3006     39   tell        1746
15   security          2906     40   terrorist   1745
16   american          2825     41   airport     1741
17   country           2794     42   pakistan    1714
18   city              2689     43   tower       1685
19   war               2679     44   bomb        1674
20   tuesday           2635     45   new         1650
21   pentagon          2620     46   building    1634
22   force             2516     47   wednesday   1593
23   government        2380     48   nation      1589
24   leader            2375     49   police      1587
25   world             2213     50   foreign     1558

When we tried to draw this subnetwork, it turned out to be almost a complete graph. To obtain something readable we removed all temporal edges with a value smaller than 10 and then removed the isolated nodes. The corresponding underlying graph is presented in the following figure.
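
A minimal sketch of this reduction is given below. It assumes that each edge carries a temporal quantity stored as a list of (start, finish, value) triples, and it reads "removing temporal edges with a value smaller than 10" as dropping every interval whose value is below the threshold; both are assumptions, and the function names and sample data are hypothetical.

```python
def cut_temporal_edges(edges, threshold):
    """Keep only the intervals with value >= threshold; drop edges left empty."""
    kept = {}
    for link, tq in edges.items():
        strong = [(s, f, val) for s, f, val in tq if val >= threshold]
        if strong:
            kept[link] = strong
    return kept

def underlying_graph(edges):
    """Forget time and values: return the node set and link set of the remaining edges.
    Nodes without any remaining edge (isolated nodes) do not appear."""
    links = set(edges)
    nodes = {x for link in links for x in link}
    return nodes, links

# Illustrative data, not the real Reuters values.
edges = {
    ("attack", "plane"): [(1, 2, 12), (2, 3, 4)],
    ("attack", "week"): [(1, 3, 2)],
    ("taliban", "afghanistan"): [(5, 6, 30)],
}
kept = cut_temporal_edges(edges, 10)
nodes, links = underlying_graph(kept)
print(sorted(nodes))
print(sorted(links))
```

In this toy input the term week loses its only edge and therefore disappears from the underlying graph.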

September 11th. Subnetwork of the most frequently used terms.

For each of the 50 nodes we determined its temporal activity and drew it. By visual inspection we identified 6 typical activity patterns (types of terms). For all charts in the figure the displayed values are in the interval [0,200], although the largest activity value for the term Wednesday is larger than 200.

The primary terms are the terms with a very high frequency of appearance in the first week after September 11th and smaller, slowly declining values in the following period. The representative of this group in the figure is hijack; the other members are: airport, american, attack, city, day, flight, nation, New York, official, Pentagon, people, plane, police, President Bush, security, tower, United States, Washington, world, World Trade Center. These are the terms describing the event.

The secondary terms are a reaction to the event. There are no big changes in their values. We identified three subgroups:

a) slowly declining, represented by bin Laden (country, foreign, government, military, minister, new, Pakistan, tell, terrorism, terrorist, time, war, week);

b) stationary, represented by taliban (afghan, Afghanistan, force, group, leader); and

c) occasional with several peaks, represented by bomb (air, building, office, strike, worker).

There are three special patterns: two periodic (Wednesday and Tuesday) and one episodic (anthrax).

Types of activity. Panels: hijack, bin Laden, taliban, bomb, Wednesday, anthrax.

To make the measure of importance of a node u ∈ V also reflect the node's position in the network, we constructed the attraction coefficient att(u).

Let $A = [a_{uv}]$ be a network matrix of temporal quantities with positive real values. We define the node activity act(u) as (see Section~\ref{activ})

$$\mathrm{act}(u) = \mathrm{act}(\{u\}, V \setminus \{u\}) = \sum_{v \in V \setminus \{u\}} a_{uv} .$$

Then the attraction of the node u is defined as

$$\mathrm{att}(u) = \frac{1}{\Delta} \sum_{v \in V \setminus \{u\}} \frac{a_{vu}}{\mathrm{act}(v)} .$$

Note that the fraction $a_{vu} / \mathrm{act}(v)$ measures the proportion of the activity of the node v that is shared with the node u.

From $0 \le a_{vu} / \mathrm{act}(v) \le 1$ and $\deg(v) = 0 \Rightarrow a_{vu} = 0$ it follows that

$$\sum_{v \in V \setminus \{u\}} \frac{a_{vu}}{\mathrm{act}(v)} \le \deg(u) \le \Delta$$

where Δ denotes the maximum degree. Therefore we have 0 ≤ att(u) ≤ 1, for all u∈V.

The maximum possible attraction value 1 is attained exactly by the following nodes: a) in an undirected network, the root of a star; b) in a directed network, a node that is the only out-neighbor of each of its in-neighbors, that is, the root of a directed in-star.
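
The definitions and the star property can be checked on a small example. The sketch below is a simplification: it works with ordinary numbers instead of temporal quantities, it assumes an undirected network given as a symmetric weight matrix, and it takes deg(u) to be the number of neighbors of u. The function name and the matrix are hypothetical.

```python
def attraction(a):
    """Attraction of every node of an undirected network with symmetric weight matrix a."""
    n = len(a)
    # act(v): total weight on the edges from v to the other nodes
    act = [sum(a[v][w] for w in range(n) if w != v) for v in range(n)]
    # deg(v): number of neighbors of v; delta is the maximum degree
    deg = [sum(1 for w in range(n) if w != v and a[v][w] > 0) for v in range(n)]
    delta = max(deg)
    if delta == 0:                      # no edges at all
        return [0.0] * n
    att = []
    for u in range(n):
        # share of each neighbor's activity that goes to u; nodes with act(v) = 0 contribute 0
        shared = sum(a[v][u] / act[v] for v in range(n) if v != u and act[v] > 0)
        att.append(shared / delta)
    return att

# A star with root 0 and three leaves; the weights differ on purpose.
star = [
    [0, 2, 5, 1],
    [2, 0, 0, 0],
    [5, 0, 0, 0],
    [1, 0, 0, 0],
]
att = attraction(star)
print(att)
```

Each leaf shares all of its activity with the root, so the root gets (1 + 1 + 1)/3 = 1, while a leaf gets only its share of the root's activity divided by the maximum degree, for example (2/8)/3 for node 1.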

We computed the temporal attraction and the corresponding aggregated attraction values for all the nodes in our network, and selected the 30 nodes with the largest aggregated attraction values. They are listed in the following table:
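
One way to picture this computation is to treat the temporal network as a sequence of snapshots, compute the attraction in each snapshot, and add the values up. The sketch below does exactly that under several assumptions that are not taken from TQ: time is discrete, each snapshot is an undirected network given as a dictionary with one entry (u, v) per edge, and the maximum degree Δ is determined separately in each snapshot. All names and data are hypothetical.

```python
from collections import defaultdict

def attraction(edges):
    """Attraction of the nodes in one undirected snapshot {(u, v): weight}."""
    act = defaultdict(float)      # node activity
    nbrs = defaultdict(set)       # neighbor sets, used for the degrees
    for (u, v), w in edges.items():
        act[u] += w
        act[v] += w
        nbrs[u].add(v)
        nbrs[v].add(u)
    if not nbrs:
        return {}
    delta = max(len(s) for s in nbrs.values())   # maximum degree in this snapshot
    att = defaultdict(float)
    for (u, v), w in edges.items():
        att[u] += w / act[v] / delta   # share of v's activity that goes to u
        att[v] += w / act[u] / delta   # share of u's activity that goes to v
    return dict(att)

def aggregated_attraction(snapshots):
    """Sum of the per-snapshot attraction values of every node."""
    total = defaultdict(float)
    for edges in snapshots:
        for node, value in attraction(edges).items():
            total[node] += value
    return dict(total)

# Two illustrative snapshots (for example two days), not real data.
snapshots = [
    {("a", "b"): 4.0},
    {("a", "b"): 1.0, ("a", "c"): 3.0},
]
agg = aggregated_attraction(snapshots)
print(agg)
```

Since the attraction in a single snapshot is at most 1, an aggregated value over 66 daily snapshots would be at most 66 under these assumptions.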

30 most attractive terms in the Terror news network.

 n   term             ∑att      n   term          ∑att
 1   united_states  12.216     16   war          2.758
 2   taliban         7.096     17   force        2.596
 3   attack          7.070     18   new_york     2.590
 4   afghanistan     5.142     19   government   2.496
 5   people          5.023     20   day          2.338
 6   bin_laden       4.660     21   leader       2.305
 7   anthrax         4.601     22   terrorism    2.202
 8   pres_bush       4.374     23   time         2.182
 9   country         3.317     24   group        2.072
10   washington      3.067     25   afghan       2.040
11   security        2.939     26   world        1.995
12   american        2.922     27   week         1.961
13   official        2.831     28   pakistan     1.943
14   city            2.798     29   letter       1.866
15   military        2.793     30   new          1.851

Again we explored them visually. The following figure presents the temporal attraction coefficients of 6 selected terms. For all charts in the figure the displayed attraction values are in the interval [0,0.2].

Attraction patterns. Panels: pres Bush, Pakistan, taliban, Kabul, bomb, anthrax.

Comparing the activity charts in the previous figure with the corresponding attraction charts in this figure for the common terms (taliban, bomb, anthrax), we see that they are “correlated” (obviously act(a;t) = 0 implies att(a;t) = 0), but differ in the details.

For example, the terms taliban and bomb have small attraction values at the beginning of the time window, where they were overshadowed by the primary terms. On the other hand, the attraction of the terms taliban and Kabul increases towards the end of the time window.

In preparation. Not finished!!!


tq/ug/11.txt · Last modified: 2016/04/27 19:27 by vlado
 