Saturday, July 27, 2013

Ethnic Diversity and Human Rights

It turns out that much of the CIA World Factbook data is not in a format that lends itself to easy analysis. In particular, ethnic makeup is listed as a set of ethnicities, each followed by its percentage (if available). For this article, I only looked at the largest ethnicity in each nation and what proportion of the whole population it represents.

To clarify, I'm not looking in this article at 'white nations', 'arab nations', etc. I'm looking at 'diverse nations' vs. 'homogeneous nations'. For reference, the United States has an 80% ethnic majority (white).

I wanted to see if there was any difference in the diversity level of countries in each rating level. One way to do this would be to show histograms for each of the different ratings, but overlaying 7 histograms is pretty messy, and histograms placed side by side are hard to compare. Instead I'm showing cumulative proportions, with one line per rating. The way to interpret this is that for each point, y% of nations in that rating have a largest ethnic share of x% or less.
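For anyone who wants to build a similar chart, here is a minimal SAS sketch of one way to compute cumulative proportions per rating. It is not necessarily how the chart in this post was made. The dataset name (nations) and the variable names (civil_rights, largest_share) are placeholders I made up for illustration.

```sas
/* Hypothetical input: one row per nation, with                    */
/*   civil_rights  = Freedom House civil rights rating (1 to 7)    */
/*   largest_share = largest ethnic group's share of population, % */
proc sort data=nations out=nations_sorted;
  by civil_rights largest_share;
run;

/* OUTCUM adds CUM_FREQ and CUM_PCT to the output dataset.         */
/* With a BY statement, the percentages are computed per rating.   */
proc freq data=nations_sorted noprint;
  by civil_rights;
  tables largest_share / out=cum_share outcum;
run;

/* One step line per rating: x = largest share, y = cumulative %   */
proc sgplot data=cum_share;
  step x=largest_share y=cum_pct / group=civil_rights;
  xaxis label="Largest ethnic share (%)";
  yaxis label="Cumulative % of nations in rating";
run;
```

Each point on a line then reads the way described above: CUM_PCT percent of the nations in that rating have a largest ethnic share at or below the x value.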




I had expected that countries with the worst human rights abuses would have moderate ethnic majorities, and that was more or less shown in the data. It turns out that the countries with the best human rights records also had clear ethnic majorities. In fact, the worst countries and the best countries are more similar to each other in terms of the distribution of largest ethnic share than they are to the countries in the middle of the pack!

I suspect that this is a sort of convergence. A lot of the countries with the best human rights records are wealthy industrialized nations where most minorities are immigrants. Many of the countries with poor human rights records are developing nations where one indigenous group gained dominance through conquest, and there is a long history of ethnic conflict. 

One word of caution - there aren't that many data points here. The sparsest group is civil rights rating 7, with only 7 nations.

Saturday, June 22, 2013

Is the World Becoming More Free?

As part of the look I'm taking at Economies, Religion, and Human Rights, I've started taking a look at the historical data FreedomHouse.org makes available. 

From freedomhouse.org:
Freedom in the World, Freedom House’s flagship publication, is the standard-setting comparative assessment of global political rights and civil liberties. Published annually since 1972, the survey ratings and narrative reports on 195 countries and 14 related and disputed territories are used by policymakers, the media, international corporations, civic activists, and human rights defenders to monitor trends in democracy and track improvements and setbacks in freedom worldwide.  The Freedom in the World data and reports are available in their entirety on the Freedom House website.

Basically, it is a ranking from 1 (best) to 7 (worst) for both political freedom and civil rights, and they use this information to classify nations as "free", "partly free" or "not free". I haven't yet loaded data on nation populations, so these are just based on the number of nations for each rating. 
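As a rough illustration of what counting nations per rating looks like, here is a small SAS sketch. The dataset name (fh_ratings) and the variable names (year, political_rights) are assumptions for the example, not the layout of the Freedom House files.

```sas
/* Hypothetical input: one row per nation per year, with           */
/*   year             = survey year                                */
/*   political_rights = Freedom House rating, 1 (best) to 7 (worst)*/
proc freq data=fh_ratings noprint;
  /* COUNT   = number of nations with that rating in that year     */
  /* PCT_ROW = that count as a percentage of all nations rated     */
  /*           in the same year (year is the row variable)         */
  tables year*political_rights / out=rating_counts outpct;
run;
```

Because every nation counts once, a tiny country and a very large one carry the same weight here, which is the limitation mentioned above.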

First, political freedoms:
It's pretty clear that as far as political freedoms go, the world has gotten better. I'm a bit surprised to see how gradually the world changed. The only bit of discontinuity looks to have happened around the dissolution of the USSR. Since this is based on number of countries rather than population numbers, I think it's probably due to the USSR breaking down into many new countries, mostly with similarly poor political freedoms.

Now, Civil Rights:


With civil rights, there are a few more interesting periods. There's a big improvement in the early 90s for nations with poor civil rights, and another large movement in the early 00s for improving civil rights in countries with a fair amount of civil rights already in place. My guess is that the movement in the 00s is related to LGBT rights movements, and that the movement in the 90s is again related to the dissolution of the USSR. I would welcome comments from the more historically literate on other potential reasons for these changes.

Next up, creating choropleth maps to show locations for all these changes. Super-bonus-plan is to make a .gif that shows the changes from 1973-2012.

References:
http://www.freedomhouse.org/report-types/freedom-world

Macro: Load and Append Split Files

Sometimes it's necessary to split data files up. Maybe there are additional data points rolling in as time passes, or you're sending the data to a contractor and the file sizes can't be too big. When receiving these files, it's useful to use the APPEND procedure to put them back together.

I recently had a set of five or so such files, so instead of loading each individually, I built a macro to load, append, load, append, etc.

It's not a plug 'n play macro, since adjusting the variables, formats, and base_data_set are required, but here's the skeleton of it:

/*
Load a series of files, and append them to the base dataset.
*/

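%macro append(fname);
/* Read one delimited file into a scratch dataset */
data temp;
infile &fname dsd truncover;
/* The colon modifier makes each informat respect the DSD delimiters */
input
var1 : format1.
var2 : format2.;
run;

/* Add the new rows to the running base dataset */
proc append base=base_data_set data=temp;
run;

%mend append;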

/*
Load and append 5 files.
*/
%append("file1.csv");
%append("file2.csv");
%append("file3.csv");
%append("file4.csv");
%append("file5.csv");

This is a lot cleaner than writing a separate data step and append proc for each new file!
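If the files are numbered consistently, the five calls can be collapsed into a loop. This is only a sketch: it assumes the %append macro above is already defined and that the files really are named file1.csv through fileN.csv. The wrapper name append_all is my own.

```sas
/* Call %append once for each numbered file: file1.csv ... file&n..csv */
%macro append_all(n);
  %local i;
  %do i = 1 %to &n;
    /* Two dots: the first ends the macro variable name, */
    /* the second is the literal dot before "csv".       */
    %append("file&i..csv");
  %end;
%mend append_all;

%append_all(5);
```

With this in place, a sixth file only requires changing the argument rather than adding another line.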

Friday, June 21, 2013

Economies, Religions, and Human Rights (Part 1)

One idea I see brought up by a lot of New Atheists is that increased religiosity is responsible for worsened human rights. It's a damning claim, and one that's rarely substantiated. Despite being an atheist myself, I think it's unfair, and probably false (or at least only coincidentally true). My hypothesis is that income equality and per capita income are drivers for religiosity (inverse) and human rights. While there's no way to show causation for something like this, I do think it's worth actually looking at the data.

Data sources:
Human rights scores: Freedom House.
Income equality: Gini coefficients from the World Bank.
Per capita income and distribution of religions: CIA World Factbook.
Religiosity: Gallup Worldview.

I'm pretty excited about having skills and time to do this sort of project now. I've wanted to build some sort of model relating some of these factors ever since I took a political science course at ASU with Miki Kittilson. That class was where I was first exposed to most of these data sources.

Easy Deciles with PROC RANK

One of the tasks I'm working on right now requires creating deciles for a large data set. I'm generally more comfortable using SQL queries than the built-in SAS procedures, but it turns out SAS has something really good for this - the RANK procedure.

Apparently, most programmers use data steps and macros to create deciles, but I honestly can't see why.

PROC RANK data=<dataset> out=<output dataset> groups=<num_quantiles>;
 var <list of variables to rank>;
 ranks <list of column names for ranks>;
run;

That's it. So basically, you can use this to create deciles (or other quantiles) on an arbitrarily sized data set for an arbitrary number of variables to quantize.
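Here is a concrete sketch using the SASHELP.CARS sample table that ships with SAS. The output dataset and rank column names (cars_deciles, msrp_decile, hp_decile) are placeholders chosen for the example.

```sas
/* Decile two variables at once; GROUPS=10 assigns ranks 0 through 9 */
proc rank data=sashelp.cars out=cars_deciles groups=10;
  var msrp horsepower;            /* variables to rank              */
  ranks msrp_decile hp_decile;    /* new columns holding the decile */
run;

/* Quick check: number of rows falling in each MSRP decile */
proc freq data=cars_deciles;
  tables msrp_decile;
run;
```

Note that the groups are numbered from 0, so the lowest decile is 0 and the highest is 9. By default the smallest values get the lowest rank; adding the DESCENDING option to the PROC RANK statement reverses that.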

Reference:
Jonas V. Bilenas, JP Morgan Chase Bank, Wilmington, DE, Using PROC RANK and PROC UNIVARIATE to Rank or Decile Variables. NESUG 2009