The Majority Line: Introduction

Project Summary: The Majority Line is an analysis of all Congressional roll call votes as a non-ideological social network, using the methods and budget of an amateur computational scientist. For the most part, each Congress and chamber is viewed in isolation, with occasional spotlights on historically relevant figures. The analytical methods will be uniform across all Congresses and chambers, and results will be displayed in an interesting visual manner when possible. The budget for equipment and software is less than $2000 (mostly the cost of a refurbished laptop and Microsoft Office, neither of which is dedicated entirely to this project). Time spent is less than a year of spare time on nights and weekends. The main questions we'll attempt to answer are: (1) Who was part of the majority (technically, plurality) the most? The least? (2) Who could get the most members to agree with them? Who ended up with the fewest likely to agree with them? (3) Who agreed with each other the most? The least?

Post Summary: This post is an introduction to the project.

You could say this project started in the early 80s, when I would watch C-SPAN on my parents' satellite dish. At the very least, one of the goals of this project is to deliver something that a similar youngster, with a passing interest in the mechanics of politics, would enjoy. While I never pursued politics - or political analysis - as a vocation, the ability to attempt a project of this scale arose naturally from my background of interests and skills, the recent acquisition of a new laptop, and a desire to learn more Visual Basic for Applications.

What is this project? It is an attempt to deliver a neutral, non-partisan assessment of the voting outcomes of Congress, using techniques on the order of what would be used to analyze a season of baseball (or football, or some other sport). The intention is to have a leaderboard for various metrics, and to show graphics that are able to characterize the activity of Congress "at-a-glance." Think of it as an Almanac of Congressional Roll Call Outcomes.

Take a look at the graphics at the Chronicle of Higher Education showing the relations between universities, as measured by their assessment of peer institutions, or the graphics at Popcharts or in Edward Tufte's books, and you will see the target I am aiming for in terms of the visual display of quantitative information.

Other constraints on what I intend to present are (1) no assessment will take into account the party of the representative, (2) the relationships among the members of a particular chamber of Congress will be emphasized, (3) the chamber as a whole, but also the individuals that make up the chamber, should be easily visualized, (4) all Yea/Nay votes will be counted equally, and (5) similar techniques should be applied across all 112 (now 113) Congresses.

These constraints introduce problems. By removing the party designation we remove quite a bit of passion and interest from the subject. But perhaps by removing the party label we will discover what the true relationship is between and among the various members of Congress, and what group forms the Majority in the old saying "The Majority Rules." A chamber with over 400 members is difficult to visualize in a way that shows both the whole and each individual representative. A technique that works for the House of Representatives of the 112th Congress may not work for the House of Representatives of the 1st Congress. But we will try to maintain uniformity among methods, and I've done most of my initial testing on data for the 1st and the 112th Congresses, in order to explore the extremes of the range of data. By using all roll call votes, or rather by not selecting important votes, we are treating a vote for naming a post office the same as the vote for defense appropriations, and also the same as some other minor (or major) procedural issue. But selecting important votes takes away from our neutral stance. Other websites are already selecting votes and assessing a representative's agreement (or disagreement) with these particular issue votes.

By emphasizing the relationships among members, we can utilize the growing set of tools built for social network analysis. Adrien Friggeri has performed some interesting analysis (and presented a wonderful graphic) looking at agreement groups in the US Senate. We hope to do this for both the House and the Senate, though using simpler analysis. We will not be delving into Friggeri's community detection algorithm, or Poole and Rosenthal's DW-NOMINATE algorithm, though we will be using a measure that is similar to Poole's Percent Voting On The Winning Side. On a side note - while Poole has done much of the heavy lifting coding the roll call votes that make this project possible - one of the reasons I am taking this up is that I think his measurement of ideology is not working very well. For example, if you look at his Rank Ordering for the 110th Congress you see a definite bifurcation between the parties, yet I get a Majority that includes Senators Snowe, Collins and Smith, while Senators Clinton, Dodd, Obama, and many others rank low enough to be considered the Minority along with Senators Stevens, Lugar, and many others. (My Majority Index correlates much more strongly with his Percent Voting On The Winning Side.)

So what about that analysis? The core of what we present will be based on two measures: (1) the Majority Index, and (2) the Relationship Array. These terms will be more carefully defined in a follow-on post. Computing Majority Points is like computing Poole's winning side data, with some subtle differences. We consider the winning side to be the majority (or plurality, if appropriate), so we do not pay attention to the number of votes needed to "win" the vote. We assign a member a Majority Point of 1 for a particular roll call vote if the member votes with the majority, -1 if the member votes against the majority, and 0 if the member does not vote, votes present, or abstains, or if the question requires something other than a Yea or Nay vote (like electing the Speaker). Summing across all votes in which the representative participated, then normalizing to a range of -100 to 100 (-100 if always voting in the minority, +100 if always voting in the majority), gives us the final Majority Index for a representative in a particular chamber in a particular Congress.

The Relationship Array is computed by treating each representative's set of votes as a vector, then computing the norm of the difference between two representatives' voting vectors, which provides a percentage of agreement between each pair of representatives (bear in mind that both representatives abstaining from a vote would be considered in agreement). The vector norm is computed only across votes where both members voted. If only one member voted, that roll call vote is not considered in the computation. The Relationship Array defines a symmetric, undirected social network.
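To make the two measures concrete, here is a minimal sketch in Python (the project itself uses Visual Basic for Applications, so this is an illustration and not the project's code). Several details are assumptions of mine: the data layout (a list of roll calls, each a dict from member name to "Y", "N", or None), the function names, skipping tied roll calls, and the choice of an L1 norm on +1/-1 vote vectors, since the text above does not say which norm is used. The text is also ambiguous about the case where both members abstain; this sketch follows the stricter rule and counts only roll calls where both members cast a Yea or Nay.

```python
from itertools import combinations

YEA, NAY = "Y", "N"


def winning_side(roll_call):
    """Return the side with more votes, or None on a tie (tie rule is an assumption)."""
    yeas = sum(1 for v in roll_call.values() if v == YEA)
    nays = sum(1 for v in roll_call.values() if v == NAY)
    if yeas == nays:
        return None
    return YEA if yeas > nays else NAY


def majority_index(roll_calls, member):
    """Sum Majority Points over the member's Yea/Nay votes, scaled to -100..100."""
    points = 0
    participated = 0
    for roll_call in roll_calls:
        vote = roll_call.get(member)
        winner = winning_side(roll_call)
        if vote not in (YEA, NAY) or winner is None:
            continue  # worth 0 points and not counted as participation
        participated += 1
        points += 1 if vote == winner else -1
    return 100.0 * points / participated if participated else 0.0


def agreement(roll_calls, a, b):
    """Percent agreement between two members over roll calls where both voted.

    Votes are coded +1 (Yea) / -1 (Nay); each disagreement adds 2 to the
    L1 norm of the difference, so the norm is divided by 2 * shared votes.
    Returns None when the two members share no votes.
    """
    diff = 0
    shared = 0
    for roll_call in roll_calls:
        va, vb = roll_call.get(a), roll_call.get(b)
        if va not in (YEA, NAY) or vb not in (YEA, NAY):
            continue
        shared += 1
        diff += abs((1 if va == YEA else -1) - (1 if vb == YEA else -1))
    if shared == 0:
        return None
    return 100.0 * (1 - diff / (2 * shared))


def relationship_array(roll_calls, members):
    """Symmetric table of pairwise agreement percentages."""
    array = {}
    for a, b in combinations(members, 2):
        array[(a, b)] = array[(b, a)] = agreement(roll_calls, a, b)
    return array
```

Calling `relationship_array(roll_calls, ["A", "B", "C"])` gives one entry per ordered pair, with the same value stored in both directions, which is what makes the resulting network symmetric.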

If we assume that this percentage of past agreement is an estimate of the probability of future agreement, we can perform a Monte Carlo simulation of votes to compute the typical number of members who would agree with a given representative. This is a measure of the ability to gain agreement from colleagues, and it is normalized by the number of members of the chamber to produce a percentage.
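One way such a simulation could look, as a rough sketch only: each colleague is treated as agreeing independently with probability equal to their past agreement percentage, and the number of agreeing colleagues is averaged over many simulated votes. The independence assumption, the function name `simulated_support`, the trial count, and passing the chamber size as an explicit parameter are all my own choices for illustration; the post does not specify them.

```python
import random


def simulated_support(agree_pct, chamber_size, trials=10_000, seed=0):
    """Estimate the percentage of the chamber expected to agree with one member.

    agree_pct    -- past agreement percentages (0..100) with each colleague
    chamber_size -- number of members used for normalization
    trials       -- number of simulated votes (an arbitrary default)
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    total_agreeing = 0
    for _ in range(trials):
        # Each colleague independently agrees with probability pct / 100.
        total_agreeing += sum(1 for pct in agree_pct if rng.random() < pct / 100.0)
    return 100.0 * total_agreeing / (trials * chamber_size)
```

Under the independence assumption the average could also be computed directly as the sum of the agreement probabilities; the simulation becomes more useful if you also want the spread of outcomes and not just the typical value.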

Given the three-fold perspective of (1) what is important is the individual rather than the party, (2) the majority rules, and (3) only voting outcomes matter, the overall goal is to present a clear picture of who is ruling, who is opposing, who is left in between, and how pairs of representatives agree with each other.


Thursday, January 3, 2013
