# Riots and subways, a relationship moderated by neighborhood income level

### Accessibility

We present an accessibility measure that quantifies how easily one can move around a city using its public transport network. Similar measures have already been proposed^{20,21,22}; however, we aim to define a flexible and general formulation that can be applied to any network. This formulation helps to understand how accessibility is distributed across a city and how it is affected by changes in the transport network. We introduce the notion of an *extended network* (also referred to here as the wide area network, WAN), based on the area around each network node (i.e., a metro station) that is accessible to users. This area reflects the distance people are willing to walk to use public transport.

We call *accessibility* the proportion of this extended network that a user at a given geographical position can reach using only one means of transport. This measure is associated with the notion of potential accessibility^{23} because it depends on the location of the network nodes and how they are interconnected. The result is a two-dimensional distribution showing accessibility within a city.

To implement the accessibility measure, we assume a network composed of \({n}_{b}\) lines. The station locations of each line and the corresponding extended line are stored in independent arrays of size \({n}_{x}\times {n}_{y}\). The arrays storing the metro network and the extended network are called \(b\) and \(e\), respectively. Each element of these arrays takes the value \(0\) or \(1\). If line \(k\) (\(k=1,\ldots ,{n}_{b}\)) has a station at position \((i,j)\), then the element of the array for that line at that position has the value \(1\) (\(0\) otherwise), written \({b}_{i,j}^{(k)}=1\). The extended network associated with this metro line is defined as follows: if \({b}_{i,j}^{(k)}=1\), then the elements of the extended network linked to line \(k\), \({e}^{(k)}\), have the value \(1\) at position \((i,j)\) and its surroundings. Therefore, for line \(k\) and a station at position \((i,j)\),

$${e}_{i,j}^{(k)}={e}_{i\pm 1,j}^{(k)}={e}_{i,j\pm 1}^{(k)}={e}_{i\pm 1,j\pm 1}^{(k)}=1$$

(1)

The square-shaped pattern generated by this extended network was used because of its simplicity and ease of implementation.
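The construction in Eq. (1) can be illustrated with a minimal sketch. The function name `extend_line`, the example grid, and the handling of the grid border (cells that would fall outside the grid are simply dropped) are assumptions made for illustration and are not taken from the text. Indices are 0-based here, whereas the equations use 1-based indices.

```python
def extend_line(b):
    """Return the extended network e for one line's station grid b (Eq. 1).

    b is a list of rows holding 0/1 values. Every station cell and its
    eight neighbours are set to 1 in e. Neighbours outside the grid are
    dropped (an assumption; the text does not discuss the borders).
    """
    n_x, n_y = len(b), len(b[0])
    e = [[0] * n_y for _ in range(n_x)]
    for i in range(n_x):
        for j in range(n_y):
            if b[i][j] == 1:
                # Mark the 3x3 square centred on the station.
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        p, q = i + di, j + dj
                        if 0 <= p < n_x and 0 <= q < n_y:
                            e[p][q] = 1
    return e


# Hypothetical 4x5 grid: one line with stations at (1, 1) and (1, 3).
b_line = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
e_line = extend_line(b_line)
for row in e_line:
    print(row)
```

In this example the two 3x3 squares overlap in the middle column, and the overlapping cells are still stored as 1, which is the behaviour the operator \(P\) later enforces across lines.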

Using the previous definitions, the entire WAN can be written as follows:

$$R_{i,j} = P\left[ \sum\limits_{k = 1}^{n_{b}} e_{i,j}^{(k)} \right]$$

(2)

where \(P\) is an operator acting on each element of the array \({\sum }_{k=1}^{{n}_{b}}{e}_{i,j}^{(k)}\), defined as follows:

$$P\left[a\right]=\left\{\begin{array}{ll}1 & \text{if } a>0\\ 0 & \text{if } a\le 0\end{array}\right.$$

(3)

This operator normalizes each element of \({R}_{i,j}\) to \(0\) or \(1\) and avoids double counting where the extended networks of different lines overlap. Using Eq. (2), we can define the total size of the extended network as the sum of all its elements:

$$N_{t} = \sum\limits_{i = 1}^{n_{x}} \sum\limits_{j = 1}^{n_{y}} R_{i,j}$$

(4)

The region of the plane reachable from a particular position \((i,j)\) can now be defined as the array:

$$f_{l,m}^{(i,j)} = P\left[ \sum\limits_{k = 1}^{n_{b}} e_{i,j}^{(k)} e_{l,m}^{(k)} \right]$$

(5)

The product of the \(e\) arrays ensures that only extended lines with a nonzero element at \((i,j)\) contribute to the sum, and the operator \(P\) avoids multiple counting at points where those lines overlap.

Finally, we can calculate the accessibility at position \((i,j)\) as the fraction of the total WAN that can be reached from that location:

$$A^{(i,j)} = \frac{1}{N_{t}}\sum\limits_{l = 1}^{n_{x}} \sum\limits_{m = 1}^{n_{y}} f_{l,m}^{(i,j)}$$

(6)
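Equations (2) to (6) can be traced on a small example. The sketch below is illustrative only: the function names (`step`, `total_network`, `accessibility`) and the two hypothetical extended lines on a 3x4 grid are assumptions, and indices are 0-based. It also assumes the network is non-empty, so that \(N_t > 0\).

```python
def step(a):
    """Operator P from Eq. (3): 1 if a > 0, otherwise 0."""
    return 1 if a > 0 else 0


def total_network(e_lines):
    """R from Eq. (2): cell-wise union of all extended lines."""
    n_x, n_y = len(e_lines[0]), len(e_lines[0][0])
    return [[step(sum(e[i][j] for e in e_lines)) for j in range(n_y)]
            for i in range(n_x)]


def accessibility(e_lines, i, j):
    """A^(i,j) from Eqs. (4)-(6) for the cell at (i, j)."""
    R = total_network(e_lines)
    n_t = sum(sum(row) for row in R)  # Eq. (4); assumed to be > 0
    n_x, n_y = len(R), len(R[0])
    reached = 0
    for l in range(n_x):
        for m in range(n_y):
            # Eq. (5): cell (l, m) counts once if any line covers
            # both (i, j) and (l, m).
            reached += step(sum(e[i][j] * e[l][m] for e in e_lines))
    return reached / n_t  # Eq. (6)


# Two hypothetical extended lines that overlap only at cell (0, 1).
e1 = [[1, 1, 0, 0],
      [1, 1, 0, 0],
      [0, 0, 0, 0]]
e2 = [[0, 1, 1, 1],
      [0, 0, 0, 0],
      [0, 0, 0, 0]]
e_lines = [e1, e2]

print(accessibility(e_lines, 0, 0))  # cell covered by line 1 only
print(accessibility(e_lines, 0, 1))  # cell covered by both lines
print(accessibility(e_lines, 2, 2))  # cell outside the extended network
```

Here the total network has six cells. A cell served only by the first line reaches four of them, a cell in the overlap reaches all six, and a cell outside the extended network reaches none.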

### Empirical specification

We use regression techniques to explore the heterogeneity of the association between proximity to the metro network and riot intensity across neighborhood income levels. Other methods, such as propensity score matching, are less suited to describing associations when the key variable (in this case, distance to the metro system) is continuous and are also less amenable to heterogeneity analyses. We measure the distance between the coordinates of each incident and its nearest metro station as the spherical distance between them (as the crow flies).
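The text does not state which formula was used for the spherical distance. As a minimal sketch, the haversine formula is one common choice. The Earth radius constant, the function names, and the coordinates below are assumptions for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_station_km(point, stations):
    """Distance in km from point to the closest station.

    point is a (lat, lon) pair and stations is a non-empty list of
    (lat, lon) pairs.
    """
    return min(haversine_km(point[0], point[1], s[0], s[1]) for s in stations)


# Hypothetical coordinates, not taken from the study's data.
incident = (0.0, 0.0)
stations = [(0.0, 1.0), (0.0, 3.0)]
print(nearest_station_km(incident, stations))
```

A brute-force minimum over all stations is adequate for a sketch; a spatial index would be a natural substitute for a large number of incidents.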

We describe a bivariate relationship between distance to the metro network and rioting as follows,

$${Y}_{i}={\beta }_{0}+{\beta }_{1}{d}_{i}+{\varepsilon }_{i}$$

(7)

where \({Y}_{i}\) is the natural logarithm of the number of reported riots in grid cell \(i\), \({d}_{i}\) is the distance from the centroid of the grid cell to the nearest metro station (our key variable), and \({\varepsilon }_{i}\) is the error term.

Because omitted variables could bias the estimated association between metro proximity and riot intensity, we include covariates such as neighborhood income level and education variables, dimensions identified in previous research as determinants of riots^{8,24}. The following equation describes the empirical specification, including the covariates:

$${Y}_{i}={\beta }_{0}+{\beta }_{1}{d}_{i}+{\boldsymbol{inc}}_{i}^{\prime}{\boldsymbol{\beta }}_{2}+{\boldsymbol{x}}_{i}^{\prime}{\boldsymbol{\beta }}_{4}+{\varepsilon }_{i}$$

(8)

In this case, \({\boldsymbol{inc}}_{i}^{\prime}\) is a vector of dummy variables for quartiles of neighborhood average income and \({\boldsymbol{x}}_{i}^{\prime}\) is a vector of educational covariates.

Because we want to explore how the association between metro distance and riot intensity varies with income, we include the interaction between metro distance and neighborhood income level:

$${Y}_{i}={\beta }_{0}+{\beta }_{1}{d}_{i}+{\boldsymbol{inc}}_{i}^{\prime}{\boldsymbol{\beta }}_{2}+{d}_{i}\times {\boldsymbol{inc}}_{i}^{\prime}{\boldsymbol{\beta }}_{3}+{\boldsymbol{x}}_{i}^{\prime}{\boldsymbol{\beta }}_{4}+{\varepsilon }_{i}$$

(9)

In Eq. (9), the vector \({\boldsymbol{\beta }}_{3}\) captures the degree of heterogeneity in the association between distance to the metro network and riot density across income quartiles.
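To make this reading explicit, Eq. (9) can be differentiated with respect to \({d}_{i}\). The notation \({\beta }_{3,q}\) for the element of \({\boldsymbol{\beta }}_{3}\) belonging to income quartile \(q\) is introduced here for illustration, and we assume that one quartile is omitted as the reference category (the text does not state which one):

$$\frac{\partial {Y}_{i}}{\partial {d}_{i}}={\beta }_{1}+{\boldsymbol{inc}}_{i}^{\prime}{\boldsymbol{\beta }}_{3}=\left\{\begin{array}{ll}{\beta }_{1} & \text{if cell } i \text{ is in the reference quartile}\\ {\beta }_{1}+{\beta }_{3,q} & \text{if cell } i \text{ is in quartile } q\end{array}\right.$$

Under this assumption, \({\beta }_{1}\) is the association for the reference quartile and each \({\beta }_{3,q}\) is the difference relative to it. Because \({Y}_{i}\) is a natural logarithm, a one-unit increase in distance is associated with an approximate \(100\times ({\beta }_{1}+{\beta }_{3,q})\) percent change in the number of reported riots for cells in quartile \(q\), an approximation that holds when the coefficients are small.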