Interest in social network analysis has exploded in the past few years, partly thanks to advances in statistical methods and computing for network analysis. A wide range of methods for network analysis is already covered by existing R packages. However, no comprehensive packages are available to calculate group centrality scores and to identify key players (i.e., those players who constitute the most central group) in a network. These functionalities are important because, for example, many social and health interventions rely on key players to facilitate the intervention. Identifying key players is challenging because players who are individually the most central are not necessarily the most central as a group due to redundancy in their connections. In this paper we develop methods and tools for computing group centrality scores and for identifying key players in social networks. We illustrate the methods using both simulated and empirical examples. The package *keyplayer*, which provides the presented methods, is available from the Comprehensive R Archive Network (CRAN).

Interest in social network analysis has grown rapidly in the past few
years. This growth is due partly to advances in statistical methods
and computing for network analysis and partly to the increasing
availability of social network data (e.g., network data generated by
social media). A wide range of methods for network analysis is
already covered by R packages such as
*network* (Butts 2008a),
*sna* (Butts 2008b),
*igraph* (Csardi and Nepusz 2006),
*statnet* (Handcock et al. 2008), and
*RSiena* (Ripley et al. 2013).
However, none of these packages provides a comprehensive toolbox to
calculate group centrality measures and to identify key players, who
constitute the most central group, in a network. Determining the key
players in a network is important because many social and health
interventions rely on key players to facilitate the intervention. For
example, Kelly et al. (1991) and Latkin (1998) trained peer leaders as
educators to promote HIV prevention. Campbell et al. (2008) and An (2015)
used peer leaders to facilitate smoking prevention. Borgatti (2006) and
Ressler (2006) suggested removing key figures among terrorists to
disrupt terrorism as widely as possible. More examples of this sort can
be found in Valente and Pumpuang (2007) and Banerjee et al. (2013), among
others. Identifying key players is challenging because
players who are individually the most central are not necessarily the
most central as a group due to redundancy in their connections. In a
seminal paper, Borgatti (2006) pointed out the problem and proposed
methods for identifying key players in social networks.

To the best of our knowledge, the `keyplayer` function in *UCINET*
(Borgatti et al. 2002) is the first implementation of the methods
detailed in Borgatti (2006). It has evolved from a separate add-on to
*UCINET* into a built-in function of *UCINET*. In this paper, we present
the *keyplayer* package (An and Liu 2016) in R, which differs from the
`keyplayer` function in *UCINET* in several respects. (1) Unlike the
`keyplayer` function in *UCINET*, which is only applicable to binary
networks, *keyplayer* in R can be used for both binary and weighted
networks. (2) The *keyplayer* package includes more centrality measures
for choosing key players than are currently available in the
`keyplayer` function in *UCINET*. (3) *keyplayer* provides better
integration with other open-source packages in R. Overall, the
`keyplayer` function in *UCINET* is useful for
researchers who are more familiar with *UCINET* and would like to
utilize other functionalities provided by *UCINET*, whereas *keyplayer*
is designed for users who are more familiar with R and who plan to do
more computational work.

The *influenceR*
package (Simon and Aditya 2015) aims to provide calculations of several node
centrality measures that were previously unavailable in other packages,
such as the constraint index (Burt 1992) and the bridging score
(Valente and Fujimoto 2010). It can also be used to identify key players in a network.
But in comparison to *keyplayer*, it utilizes only one centrality metric
when selecting key players whereas *keyplayer* includes eight different
metrics. Also, *influenceR* currently works only for undirected networks
whereas *keyplayer* works for both undirected and directed networks.
Both packages provide parallel computation. *influenceR* relies on
OpenMP for parallel computation whereas *keyplayer* utilizes the base
package *parallel*, which is readily available in R. Last, *influenceR*
focuses on computing centrality measures at the node level whereas
*keyplayer* focuses on providing centrality measures at the
group level. Overall, *keyplayer* provides more comprehensive
functionalities for calculating group centrality measures and for
selecting key players.

The algorithm for identifying key players in package *keyplayer*
essentially consists of three steps. First, users choose a metric to
measure centrality in a network. Second, the algorithm (specifically the
`kpcent` function) will randomly pick a group of players and measure
their group centrality. Third, the algorithm (specifically the `kpset`
function) will select the group of players with the highest group
centrality as the desired key players. In general, users only need to
employ the `kpset` function by specifying a centrality metric and the
number of key players to be selected. The function will return a set of
players who are the most central as a group. We also make the auxiliary
function `kpcent` available. If users specify a centrality metric and
the indices of a group of players, this function will return the
centrality score of the specified group. Thus the two functions can be
used for two purposes: selecting key players or measuring group
centrality.
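
The two roles described above (scoring a given group and searching for the best group) can be illustrated with a minimal base R sketch. It is not the package's implementation. The helpers `group_degree` and `best_group` are hypothetical names introduced here, the group score is assumed to be a simple group degree on a binary, undirected network (the number of non-members tied to at least one member), and the search is a plain exhaustive enumeration rather than the package's search procedure. The toy network is also made up for illustration.

```r
# Hypothetical stand-in for group scoring (NOT the package's kpcent):
# group degree = number of non-members tied to at least one group member.
group_degree <- function(adj, nodes) {
  outside <- setdiff(seq_len(nrow(adj)), nodes)
  sum(colSums(adj[nodes, outside, drop = FALSE]) > 0)
}

# Hypothetical stand-in for group selection (NOT the package's kpset):
# enumerate every group of the requested size and keep the best one.
# This is only feasible for small networks.
best_group <- function(adj, size) {
  groups <- combn(nrow(adj), size)  # one candidate group per column
  scores <- apply(groups, 2, function(g) group_degree(adj, g))
  list(keyplayers = groups[, which.max(scores)], centrality = max(scores))
}

# Made-up undirected toy network with 8 nodes.
edges <- rbind(c(1, 2), c(1, 3), c(1, 4), c(1, 5), c(2, 3),
               c(2, 4), c(2, 5), c(5, 6), c(6, 7), c(6, 8))
adj <- matrix(0, 8, 8)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1  # symmetrize

# Nodes 1 and 2 have the highest individual degree, but their ties overlap.
top_two <- order(rowSums(adj), decreasing = TRUE)[1:2]
group_degree(adj, top_two)

# The best pair under this score combines node 6 with one of them instead.
best <- best_group(adj, 2)
best$keyplayers
best$centrality
```

In this toy network the two individually most central nodes reach only three other nodes as a group, whereas the selected pair reaches all six remaining nodes, which is the redundancy problem that motivates group-level selection.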

The paper proceeds as follows. First, we review centrality measures at
the individual level. Then we present methods for measuring centrality
at the group level. After that, we present a greedy search algorithm for
selecting key players and outline the basic structure and the usage of
the main function `kpset` in package *keyplayer*. To illustrate the
methods and the usage of the package, we use a simulated network as well
as an empirical example based on the friendship network among managers
in a company. Last, we summarize and point out directions for improving
the package in the future.

We first review the definitions of centrality measures at the individual level. For conciseness, we provide the definitions based on weighted networks, where the weight of a tie takes a continuous value and usually measures the strength of the connection between two nodes. The definitions naturally incorporate binary networks where the weight of a tie can only be one or zero, indicating the presence or absence of a connection (Freeman 1978; Wasserman and Faust 1994; Butts 2008b).

Figure 1 shows an example of a simulated network. On the
left is the adjacency matrix of the network. On the right is the network
graph. Thinking of it as a friendship network, we can see that the
strength of friendship between node 1 and node 3 is conceived
differently by node 1 and node 3. The former assigns it a weight of 3
while the latter assigns it a weight of 1. We will use this example to
illustrate the centrality measures. Calculations of four centrality
measures (i.e., degree, closeness, betweenness, and eigenvector
centralities) at the individual level are done using the *sna* package
(Butts 2008b). Calculations of four other individual level centralities and
all group level centralities are done using our package *keyplayer*. We
would like to clarify at this point that our package does not depend on
*sna*. We use *sna* here just for the sake of the example.

\[W=\begin{bmatrix} 0 & 1 & 3 & 0 & 0 \\ 0 & 0 & 0 & 4 & 0 \\ 1 & 1 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 3 \\ 0 & 2 & 0 & 0 & 0 \\ \end{bmatrix}\]
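
For readers who want to follow along, the adjacency matrix above can be entered in base R as sketched below. The row-as-sender reading follows the text (node 1 assigns a weight of 3 to its tie with node 3). The variable names, and the use of row and column sums as weighted out- and in-degree, are illustrative assumptions rather than output from *sna* or *keyplayer*.

```r
# Adjacency matrix of the example network; W[i, j] is the weight that
# node i assigns to its tie with node j.
W <- matrix(c(0, 1, 3, 0, 0,
              0, 0, 0, 4, 0,
              1, 1, 0, 2, 0,
              0, 0, 0, 0, 3,
              0, 2, 0, 0, 0),
            nrow = 5, byrow = TRUE)

# The tie between nodes 1 and 3 is valued differently by each side.
W[1, 3]  # weight assigned by node 1
W[3, 1]  # weight assigned by node 3

# Illustrative weighted degrees (assumed convention: rows send, columns receive).
out_degree <- rowSums(W)
in_degree  <- colSums(W)

# Binary version of the network: 1 if a tie is present, 0 otherwise.
B <- (W > 0) * 1
```

Dichotomizing the weights in this way recovers the binary case, where a tie is recorded only as present or absent.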