Very Interesting Map Of Comments In BB Stimulus Proceeding

In my capacity as a consultant to the Benton Foundation, I have been working with Kate Williams, a professor of informatics at the University of Illinois. Williams has been doing some (IMO) critical work on broadband sustainability. In particular, Kate has been studying the old Technology Opportunities Program to determine which projects had lasting impact and which didn’t, a rather important consideration for the new and improved BTOP program.

But what caught my attention recently is this very interesting map that Williams compiled based on the comments submitted to BTOP. It places the comments filed on a geographic map, with links to the actual comments themselves. The map includes the 58% of comments filed by the April 13, 2009 deadline which contained reliable information on the location of the commentor. The remaining 42% either gave no location or included location in an attachment which Williams considered insufficiently reliable to determine location.

Why do I find this interesting? Because it potentially provides a very interesting cross check on the state of broadband geographically, as well as on who follows these proceedings. I have long lamented that the FCC (and other federal agencies) make so little use of the data they actually collect. At best, an agency may note submissions by a class of commentors (e.g., broadcasters, MVPDs, ISPs) in the specific proceeding at issue. But no one takes the multiple data sets collected as comments in each proceeding, or across multiple proceedings, and tries to determine patterns and what they might suggest. Williams’ grouping by geography is intriguing, and I cannot help but wonder what would happen if we applied a similar analysis to multiple FCC proceedings, including to comments generated by the mass “comment engines” that have become common in some high-profile proceedings. It would be very interesting to know, for example, whether the people who feel passionate enough about media consolidation or network neutrality to file comments cluster geographically and, if so, whether we see patterns of geographic interest that might tell us about the actual situation on the ground.

Of course, the comments are not a scientifically sampled data set: to comment, a commentor must (a) know about the proceeding and (b) feel strongly enough to file comments. But the fact that the information carries a particular set of biases does not render it meaningless, especially if one controls for those biases.

I hope researchers use Williams’ map, both to analyze the BTOP comments and as a model going forward for analysis of other proceedings.

Stay tuned . . . .

  1. Dan says:

    Harold, this is certainly interesting, but as you note it is not a randomized sample.

    Even if it were, however, there are other technical issues that arise when trying to understand what if anything these data tell us.

    For example, total population does not cluster evenly across the country, so if one wanted to evaluate clustering, one should try to define a standardized geographic unit of measurement (county, census block, whatever), within which to evaluate the clustering of comments (possibly on a per-capita basis, depending on the question one is trying to answer, or perhaps a percent-share).

    Another question I have is whether this represents some actual measurement of geographical differences in conditions on the ground, or is there the potential of some systematic skew due to grass-roots mobilizations as well as industry group propaganda.

    965 data points is not a tremendous amount of data when one is trying to get a regional breakdown across the country. Even with a random sample, this sample size is kinda small if you want to get detailed. Assuming one is trying to look at it statistically, the standard error may yield insignificant results for any region small enough to be interesting.

    Finally, what hypotheses exist with regard to geographical clustering of opinions about broadband policy? What questions are you thinking about that these data might be mustered in the service of answering?

    If one had a particular question, then the comments could be coded quantitatively with respect to that question, which requires some judgment, but such judgment can be justified with a structural argument that is reported transparently in any report of such analysis.

    But unless one has a question already defined, there is no context in which to code free-form open-ended comments, so the hypothesis is a prerequisite to any relevant quantitative analysis.

    I’m intrigued by the idea, and visual presentation of quantitative information can often be very powerful, but what exactly do you propose that such geographical data can help illuminate?

  2. Harold says:


    It would certainly help to have maps like this for multiple, related proceedings to provide some comparisons.

  3. Dan says:

    Yes, but I’m interested in what comparisons you envision for this, and I want to make sure it’s an apples-to-apples comparison, whatever it is.

    When I look at the map, what I see looks like an echo of the general clustering of total population in big cities on the coasts and Great Lakes, with somewhat higher density in the middle to the east of the Mississippi River (lots of mountains to the west, and desert).

    In that sense, the clustering could be almost even on a per-capita basis (that is, everything could be essentially the same, within statistical error). What I’m more interested in is how this geographical clustering might actually differ in some way from the overall clustering of general population.

    Also, things like urban/rural, center-city/suburban, etc. Or even, each commenter who submits an address can be associated with a census block, each of which is known to have certain demographic characteristics, and then the total set of such comments can be analyzed not according to mere region, but according to local regional characteristics.

    I’m just trying to figure out what comparisons are useful, and what might be relevant to FCC proceedings in particular.

    Maybe a pop-balanced cartogram would be more interesting than the raw geographical map as a foundation for the comment distribution, just as a starter.

    Not trying to be difficult here, but rather exploring what would make this genuinely compelling as a tool to understand what’s really going on on the ground.
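
For anyone who wants to try the normalization and sample-size checks raised in the comments above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the region names, the per-region comment counts, and the populations are invented placeholders (the counts are merely chosen to sum to the 965 mapped comments mentioned above), and the function names are hypothetical. It computes comments per 100,000 residents, a "location quotient" (a region's share of comments divided by its share of population, so a value near 1.0 means the comments simply echo population), and the usual binomial standard error of a proportion.

```python
import math

# Hypothetical placeholder data, NOT taken from Williams' map.
# Counts are chosen only so that they sum to 965.
comments = {"Region A": 600, "Region B": 300, "Region C": 65}
population = {"Region A": 6_000_000, "Region B": 3_000_000, "Region C": 650_000}


def per_capita_rates(comments, population, per=100_000):
    """Comments filed per `per` residents in each region."""
    return {r: comments.get(r, 0) * per / population[r] for r in population}


def location_quotients(comments, population):
    """Share of all comments divided by share of all population, per region.

    Values near 1.0 mean the region files about as many comments as its
    population alone would predict; values well above or below 1.0 are
    the regions that might actually be interesting.
    """
    total_comments = sum(comments.get(r, 0) for r in population)
    total_population = sum(population.values())
    return {
        r: (comments.get(r, 0) / total_comments) / (population[r] / total_population)
        for r in population
    }


def share_standard_error(k, n):
    """Standard error of the proportion k/n, assuming a simple random sample."""
    p = k / n
    return math.sqrt(p * (1 - p) / n)


rates = per_capita_rates(comments, population)
quotients = location_quotients(comments, population)

for region in population:
    print(f"{region}: {rates[region]:.1f} per 100k, quotient {quotients[region]:.2f}")

# If only 20 comments come from a region and 10 of them take some coded
# position, the estimated share of 50% is very uncertain.
print(f"SE with n=20:  {share_standard_error(10, 20):.3f}")
print(f"SE with n=965: {share_standard_error(65, 965):.4f}")
```

In this made-up data the raw counts differ almost tenfold between regions, yet every region has the same per-capita rate, which is exactly the "echo of total population" scenario described above. The standard error also assumes a random sample, which filed comments are not, so treat it as a lower bound on the real uncertainty rather than a measure of it.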

