We recently analyzed the reputation of a country’s Internet (IPv4) addresses by examining the number of blacklisted IPv4 addresses that geolocate to a given country. We compared this indicator with two qualitative measures of each country’s governance. We hypothesized that countries with more transparent, democratic governmental institutions would harbor a smaller fraction of misbehaving (blacklisted) hosts. The available data confirms this hypothesis. A similar correlation exists between perceived corruption and the fraction of blacklisted IP addresses. We provide details of our data sources and analysis below.
Corruption Perceptions Index (CPI)
Transparency International is a non-governmental organization that monitors and publicizes corporate and political corruption in international development. It publishes the Corruption Perceptions Index (CPI), which combines 13 different surveys and assessments to create a single CPI score. The CPI generally defines corruption as “the misuse of public power for private benefit”, and it ranges from 0 (highly corrupt) to 100 (very clean).
Internet-reputation Indicators
As part of their efforts to fight spam, Internet Service Providers, governments, and security organizations share lists of Internet IP addresses identified as having been used to send spam. One of the largest warehouses of this information is the Spamhaus Project, an international organization dedicated to tracking spammers and spam-related activities. Spamhaus runs the Composite Blocking List (CBL), a list of IP addresses suspected of spam or botnet activity. The CBL project publishes a breakdown by country of the number of blacklisted IP addresses, normalized in two ways: by the number of IP addresses allocated to each country (IPop %) and by the number of Internet users (Infected %) in each country (estimated using the CIA World Factbook). We also use the CIA World Factbook to determine each country’s Gross Domestic Product (GDP), which we use to size each circle in the plot.
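The two normalizations can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the `CountryCounts` type, its field names, and the example figures are assumptions made up for this sketch, not the CBL's data format or real counts.

```python
from dataclasses import dataclass


@dataclass
class CountryCounts:
    """Hypothetical per-country inputs; field names are assumptions."""
    name: str
    blacklisted_ips: int   # addresses on the blacklist that geolocate here
    allocated_ips: int     # IPv4 addresses allocated to the country
    internet_users: int    # estimated number of Internet users


def reputation_indicators(c: CountryCounts) -> tuple[float, float]:
    """Return (IPop %, Infected %) for one country."""
    # Blacklisted addresses as a percentage of the allocated address space.
    ipop_pct = 100.0 * c.blacklisted_ips / c.allocated_ips
    # Blacklisted addresses as a percentage of the Internet user population.
    infected_pct = 100.0 * c.blacklisted_ips / c.internet_users
    return ipop_pct, infected_pct


# Made-up numbers, for illustration only.
example = CountryCounts("Exampleland", blacklisted_ips=5_000,
                        allocated_ips=2_000_000, internet_users=1_000_000)
ipop, infected = reputation_indicators(example)
print(f"{example.name}: IPop {ipop:.2f}%, Infected {infected:.2f}%")
```

A country with a very small address allocation can show a large IPop % with only a handful of blacklisted addresses, which is one reason to look at both ratios.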
We used a logarithmic least-squares fit (red line in plot) to assess the correlation between the CPI and these two Internet-reputation indicators. The plot shows that the CPI correlates more strongly with the per-user ratio (Infected %) than with the per-address ratio (IPop %). (The per-user ratio is a rough estimate of the fraction of infected users, while the per-address ratio estimates the fraction of infected IP addresses or hosts.)
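As a rough sketch of what such a fit involves, the Python below fits a straight line to the base-10 logarithm of an indicator against the CPI and reports Pearson's r. It assumes the fit is linear in the log of the indicator (as on a log-scaled y-axis); the exact form of the fit behind the plot is not specified here, and the data points are synthetic, chosen only to make the arithmetic easy to check.

```python
import math


def log_linear_fit(xs, ys):
    """Ordinary least squares for log10(y) = intercept + slope * x.

    Returns (slope, intercept, r), where r is Pearson's correlation
    between x and log10(y). All y values must be positive.
    """
    logs = [math.log10(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sll = sum((l - mean_l) ** 2 for l in logs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    slope = sxl / sxx
    intercept = mean_l - slope * mean_x
    r = sxl / math.sqrt(sxx * sll)
    return slope, intercept, r


# Synthetic points lying exactly on log10(y) = 1 - 0.02 * x (not real data).
cpi_scores = [20, 40, 60, 80]
indicator = [10 ** (1 - 0.02 * x) for x in cpi_scores]

slope, intercept, r = log_linear_fit(cpi_scores, indicator)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A negative slope means the indicator falls as the CPI rises, and the magnitude of r gives a simple way to compare how strongly each of the two ratios tracks the CPI.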
Democracy Index (DI)
We also examined the correlation between governance type and the number of blacklisted hosts. The Economist Intelligence Unit provided a Democracy Index (DI) in 2011, based on 60 indicators in five categories: electoral process and pluralism; civil liberties; functioning of government; political participation; and political culture. The index ranges from 0 to 10, with higher values indicating a higher level of democratic governance and culture. This range is further divided into authoritarian regimes (0-4), hybrid regimes (4-6), flawed democracies (6-8), and full democracies (8-10).
In contrast to the CPI, which shows a negative correlation with blacklist prevalence across its full range, the DI scatterplot divides into two governance regions along the x-axis. In authoritarian and hybrid regimes, there is little correlation with the prevalence of IP addresses on the CBL relative to user population, and almost no correlation with prevalence on the CBL relative to the country’s total IP address allocation. (Many of these countries have very small address allocations, so it does not take many misbehaving IPs to make this latter fraction approach 1.) Interestingly, the correlation is much stronger for the two regime types on the right of the graph, flawed democracies and full democracies. For both, the DI shows a much stronger (again, negative) correlation with the two Internet-reputation indicators, with the fraction of blacklisted hosts decreasing as the DI value increases.
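One way to check for this kind of split is to compute the correlation separately on each side of the DI boundary. The sketch below does that with Pearson's r on the log of the indicator. The split at DI = 6 follows the regime ranges given above; assigning a boundary score to the higher category is an assumption, as are the function names, and the data points are synthetic.

```python
import math


def regime(di: float) -> str:
    """Map a DI score to a regime type (boundary scores go to the higher category, by assumption)."""
    if di < 4:
        return "authoritarian"
    if di < 6:
        return "hybrid"
    if di < 8:
        return "flawed democracy"
    return "full democracy"


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def correlation_by_region(points, split=6.0):
    """Correlate DI with log10(indicator) below and at-or-above the split."""
    low = [(di, math.log10(v)) for di, v in points if di < split]
    high = [(di, math.log10(v)) for di, v in points if di >= split]
    r_low = pearson([p[0] for p in low], [p[1] for p in low])
    r_high = pearson([p[0] for p in high], [p[1] for p in high])
    return r_low, r_high


# Synthetic (DI, indicator) pairs: no trend below 6, steady decline above it.
points = [(1.5, 2.0), (2.5, 0.5), (3.5, 2.0), (4.5, 0.5), (5.5, 2.0),
          (6.5, 1.0), (7.5, 0.5), (8.5, 0.25), (9.5, 0.125)]
r_low, r_high = correlation_by_region(points)
print(f"authoritarian/hybrid r={r_low:.2f}, democracies r={r_high:.2f}")
```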
Combined CPI and DI
We also compared the two transparency metrics, CPI and DI, using the scatterplot below, coloring each dot according to the value of the per-user Internet-reputation metric (number of blacklisted hosts / number of Internet users). The plot reveals two “zones”: a High Reputation Zone with governments that are either sufficiently democratic (DI above 6.5) or perceived as sufficiently transparent (CPI above 45), which also tend to have fewer infected (blacklisted) IP addresses per capita, and a Mixed Reputation Zone with both low and high Internet-reputation values.
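The zone boundaries described above amount to a simple rule on the two governance scores. A minimal sketch follows, using the cutoffs quoted in the text; the function name is hypothetical, and the treatment of scores exactly at a cutoff is an assumption.

```python
def reputation_zone(cpi: float, di: float,
                    cpi_cutoff: float = 45.0, di_cutoff: float = 6.5) -> str:
    """Assign a country to a zone from its CPI (0-100) and DI (0-10) scores."""
    # Either condition alone is enough for the High Reputation Zone.
    if di > di_cutoff or cpi > cpi_cutoff:
        return "High Reputation Zone"
    return "Mixed Reputation Zone"


# Illustrative score pairs, not real countries.
print(reputation_zone(cpi=30, di=7.0))  # democratic enough
print(reputation_zone(cpi=50, di=3.0))  # perceived as transparent enough
print(reputation_zone(cpi=30, di=5.0))  # neither
```

Note that the rule uses only the governance scores. The Internet-reputation values are what the zones are then compared against.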
Conclusion
Of the two measures of a country’s governance reputation we examined, the CPI was more strongly correlated than the DI with misbehaving (that is, blacklisted) IPv4 addresses in a country. However, for democratic governments, the DI was even more strongly correlated than the CPI with our two Internet-reputation indicators.
In the future we hope to investigate whether these correlations hold for other blacklist or attack data and for other indicators of reputation, both in meatspace and on the Internet. We would also like to find other metrics that differentiate among countries in the Mixed Reputation Zone.