I’ve been playing some more with the fire data from my last post to try to identify some aspect of the data that might help explain the pattern. My first step was to localize the fire data to specific counties. The original data set does not include county information, but it does contain latitudes and longitudes. I was able to map each latitude/longitude pair to a specific California county using the over function in R (from the sp package).
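For readers curious about what a point-to-county lookup involves, here is a minimal sketch of the underlying idea in Python. It is not the sp::over call used for this post; it is a plain ray-casting point-in-polygon test. The county names, the rectangular boundaries, and the fire coordinates are all made up for illustration, and real county polygons would come from a boundary file.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly connected back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def county_for_point(lon, lat, counties):
    """Return the first county whose polygon contains the point, else None."""
    for name, polygon in counties.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


# Hypothetical, heavily simplified rectangles (assumption, not real boundaries).
counties = {
    "County A": [(-117.0, 33.0), (-115.0, 33.0), (-115.0, 34.0), (-117.0, 34.0)],
    "County B": [(-119.0, 33.0), (-117.0, 33.0), (-117.0, 34.0), (-119.0, 34.0)],
}

# Made-up fire locations as (longitude, latitude) pairs.
fires = [(-116.2, 33.5), (-118.4, 33.9), (-121.0, 38.0)]
fire_counties = [county_for_point(lon, lat, counties) for lon, lat in fires]
print(fire_counties)
```

A point that falls outside every polygon comes back as None, which is a convenient way to spot records with bad coordinates before counting fires per county.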
A choropleth map for 2007 shows that Riverside County had a significant number of fires, almost a third of the fires that took place in California that year.
Inspired by a post on Nathan Yau’s excellent data visualization blog Flowing Data showing a calendar view of fatal accidents, I thought I’d try creating a calendar map with some wildfire data I have been playing with (available from NIFC). This data set includes fire data going back to 1972.
Starting with R calendar map code by Paul Bleicher, I created plots of fires in California for various years.
I’ve been working on some research with Robb Dunbar of the University of Minnesota – Rochester that looks at using group or ‘pyramid’ exams as a cooperative learning technique. In this model, students take each exam individually. They then re-take the same exam cooperatively within a small group. Their overall grade is a weighted combination of the two grades.
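As a small illustration of the weighted combination, here is a sketch in Python. The 75/25 split between the individual and group scores is an assumption chosen for the example, as are the sample scores; the actual weights used in the course are not stated here.

```python
def combined_grade(individual, group, individual_weight=0.75):
    """Weighted combination of an individual score and a group score.

    individual_weight is a hypothetical default; the group score gets
    the remaining weight (1 - individual_weight).
    """
    if not 0.0 <= individual_weight <= 1.0:
        raise ValueError("individual_weight must be between 0 and 1")
    return individual_weight * individual + (1.0 - individual_weight) * group


# Made-up scores: 80 on the individual exam, 92 on the group retake.
overall = combined_grade(80.0, 92.0)
print(overall)  # 83.0
```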
Analyzing this kind of data presents an interesting challenge. The group’s score is not completely dependent on the individual scores. The group members interact to determine their group answers in ways that can have significant effects on the group score. In playing around with the data, I ended up creating a chart that highlights patterns in the data that were not visible with standard techniques.
The first thing to notice is that most students improved on the group test. The second is that there appear to be three types of groups:
Type 1: The group score is (more or less) equal to that of the highest individual. This group may be characterized as having a strong leader – the highest scoring student asserts that his/her answers are correct and others follow.
Type 2: The group score is less than the highest individual score, but greater than all other individual scores in the group. This group may be characterized as having a negotiating leader – the highest scoring student asserts that his/her answers are correct but others may disagree. The negotiated answer may not be correct.
Type 3: The group score is higher than all individual scores in the group. This group may be characterized as one where everyone works together to figure out which answer is correct. The highest individual score in this type of group tends to be lower than that of the other groups. Maybe their strategy is the result of the recognition that no one student has all of the correct answers.
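The three descriptions above can be turned into a rough classification rule. The Python sketch below is one possible reading, not the analysis behind the chart: the tolerance of 2 points used for "more or less equal" is an assumption, the function name is made up, and the example scores are invented.

```python
def classify_group(individual_scores, group_score, tolerance=2.0):
    """Assign a group to Type 1, 2, or 3 based on the descriptions above.

    tolerance is a hypothetical cutoff for "more or less equal" to the
    highest individual score.
    """
    if len(individual_scores) < 2:
        raise ValueError("a group needs at least two individual scores")

    ranked = sorted(individual_scores, reverse=True)
    top, runner_up = ranked[0], ranked[1]

    # Type 3: group clearly beats every individual.
    if group_score > top + tolerance:
        return "Type 3"
    # Type 1: group score is roughly the top individual's score.
    if abs(group_score - top) <= tolerance:
        return "Type 1"
    # Type 2: below the top individual but above everyone else.
    if group_score > runner_up:
        return "Type 2"
    # Anything else does not match the three descriptions.
    return "unclassified"


# Invented example groups: (individual scores, group score).
examples = [
    ([85, 70, 65], 86),
    ([90, 70, 60], 80),
    ([72, 68, 65], 85),
    ([90, 80, 60], 75),
]
labels = [classify_group(scores, group) for scores, group in examples]
print(labels)
```

The "unclassified" fallback matters: a group that scores below its second-best member fits none of the three descriptions, and counting those cases is one way to check whether the typology holds up.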
Further research is needed to figure out whether this pattern holds up and whether the descriptions make sense. However, it is an interesting example of how visualization can reveal patterns that would otherwise go unnoticed.