Dataset Documentation
Right now the Facebook Project is happy to offer researchers free access to Jeff Ginger's datasets from 2006 through 2008. Please make sure to read the documentation below to understand the methods used to collect each set as well as their limitations. Note that the quality of data gets better with each year.
- 2006: The Preliminary Study (n = ~73)
- 2007: Revision 2 (n = 75)
- 2008: Interviews (n = 26)
- 2008: The Soc 100 subject pool (n = 700+)
- Miscellaneous
The 2006 Survey Cluster
This dataset is not yet publicly available. Jeff has yet to obtain retroactive permission from the IRB for its public release - it was collected when he was an undergraduate, for a class project.
Sampling
The sample was collected in April and May of 2006 and gathered responses from a convenience sample of 124 students (73 UIUC undergraduates after filtering). The population surveyed was far from representative: it consisted of responses to a mass-message invitation sent to Jeff's entire friends list, approximately 700 people at the time. Recipients were invited to participate in whichever surveys they wished, and each survey collected identical usage and demographic data. Participation was voluntary and anonymous, and the surveys contained questions pertaining to perceptions of trust and privacy, meeting people and relationships (identity management), messaging, pictures, and groups.
Limitations
The study design was problematic: different users took different surveys with no consistency or tracking between them. That is to say, if one user took the pictures survey there was no way to tell whether they also took the messaging survey. Users were not required to log in, so they could potentially take a survey multiple times from different IP addresses (computers or networks). Survey design itself was also an issue: some questions were poorly worded, and others had answer sets that were not mutually exclusive and/or exhaustive. The surveys also employed insider terminology without explanation (would you 'friend' someone?) and may have made unwarranted assumptions about the participants. In all, the endeavor was a good exploration and exercise in research, but it should not be considered generalizable, even for the UIUC population.
Available Files
- The brainstorm sheet for the original survey questions (close to final listing) (.doc)
- Interactive graphs and summaries of the findings (to be linked after approval)
The 2007 Survey: Revision 2
This dataset comes from a survey conducted during Jeff's first year in graduate school, based in many ways on several of the surveys from the 2006 group. In the end it tried to accomplish too much and asked too many questions, but it did manage to secure some reasonable findings.
Sampling
The Revision 2 survey was sent out over the summer of 2007 to a formal, randomly selected portion of the undergraduate student population. All respondents were full-time, degree-seeking students over the age of 18. The decision was made to exclude part-time and non-degree-seeking students after it was determined they were statistically likely to be significantly older and comprised only a minimal, outlier population at UIUC. The official university statistics department, the Division of Management Information (DMI), pulled an 1100-person sample at random from the entire undergraduate student population. A mass email then invited each selected participant to log in with their university ID to a secure survey form. No remuneration was offered, nor were participants required to pay to participate. The response rate was very poor due to a survey response limit mistake[1] as well as the sheer length of the survey and technological limitations[2] that prevented collection of partial or specific responses. All told, only 75 students fully completed the survey (a pitiful 7% response rate; 2 of them did not have Facebook accounts), which effectively means the data are not generalizable to the overall student population to a statistically significant degree.
[1] An ambiguous category for the number of responses was embedded amongst questions pertaining to per respondent limitations – I initially mistook it to be the number of times a single respondent could fill out the survey.
[2] The DMI required the use of a University-built survey builder application that did not allow for skip logic or multiple user pathways, nor did it capture responses of partially filled out surveys.
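The response-rate figures quoted above are easy to sanity-check. A minimal sketch (hypothetical, not part of the original study materials; all numbers come from the text):

```python
# Sanity-check of the Revision 2 response-rate figures quoted above.
# All numbers come from the documentation text, not the actual dataset.
invited = 1100        # random sample drawn by the DMI
completed = 75        # fully completed surveys
no_facebook = 2       # completers without Facebook accounts

response_rate = completed / invited
with_accounts = completed - no_facebook

print(f"Response rate: {response_rate:.1%}")              # 6.8%, i.e. roughly 7%
print(f"Completers with Facebook accounts: {with_accounts}")
```

This also shows why the rounded "7%" in the text is slightly generous: the exact figure is closer to 6.8%.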
Limitations
Besides the aforementioned sampling problems, the survey had some minor issues. Several items were not necessarily directly comparable (notably, asking respondents what they would publicly announce as compared to what they list on Facebook) and others collapsed answers too much (number of friends was binned into ranges, and some activities were multi-faceted, like "read, reply to, or compose messages"). Unfortunately the low response rate makes demographics-related study almost impossible as well: minority respondents in terms of ethnicity and sexuality were too few in number to use effectively in statistical tests.
Available Files
- IRB #07570 approval (.pdf)
- Survey questions (.doc)
- Full dataset for Excel (.xls)
- Full dataset for SPSS (.sav)
- Frequencies (.doc)
- Measures of central tendency - a very long table (.htm)
- Summary comparison between 2006 and 2007 - usage and demographics (.docx)
- Explanation of the assertive-activity (assertivity) index (.doc)
Meaning and Purpose in Everyday Life: Interview Sessions
In general these were a set of open-ended interviews focusing on the everyday meaning and purposes Facebook serves in student life. I asked questions regarding students' uses of the SNS, their technological history, the ways they become informed by the website, the ways they communicate through the website, and the ways it relates to ties (indirectly). I also later jumped into impression management and much more interesting questions relating to insider understandings, like what they consider to be creepy (stalker) use of Facebook, who they think their invisible audience is, or how information they find on Facebook influences their face-to-face life.
Note that I have yet to transcribe all of these. My question set also went through several evolutions as the study progressed; I hope to explain this progression in detail at some point in a paper.
Sample
I recruited a small convenience sample from three Sociology 100 classes. Sign-up was voluntary and open to undergraduates over the age of 18. Students who participated were given a small amount of extra credit in the class. I had originally planned to do only 16 interviews, but after having sign-up open for a day I was already flooded with too many requests. I did as many interviews as I could before the end of the semester, coming to 26 in total with a wide variety of students, in effect accomplishing a bit of a theoretical sample. The IRB requires you to very carefully detail your procedures, so to really do an effective theoretical sample I would have had to be more selective and invasive with students, which would have increased the IRB turnaround time as well as caused a drop in my numbers. I instead just went for as many people as I could, in the hope of gathering a wide variety of individuals and perspectives. Not every respondent was hugely into Facebook (or even liked it), largely as a result of the extra-credit incentive. More girls than guys signed up, but not too many more. As Soc 100 is an entry-level general education course (it fulfills a requirement), participants were mostly freshmen, but from a number of different majors.
Limitations
The group was drawn only from Soc 100, and all of them knew me as a TA for the course. Then there are the typical limitations for this sort of thing: the people who sign up for an interview right off the bat are usually more assertive and confident, and in this case probably more likely to be interested in Facebook. I didn't ask about demographic information beyond locale (whether they grew up in an urban, suburban, or rural environment). Call me a bad researcher, but I felt it would freak people out to be asked how they identify ethnically or what their sexual preference is during a Facebook interview. The students talk about these sorts of things in their short assignments for the class, but those aren't released to the public; I didn't feel good charging into private identity matters or political or religious preferences in a casual interview. I fumbled my words often, asked questions in confusing ways, and probably revealed too much of myself to respondents (through laughter or my own unintentional expressions of interest in Facebook). Essentially this was my first escapade in interviewing, and I'm sure I made more mistakes than I could imagine. Regardless, I think the transcriptions and topic-related excerpts could prove very useful for researchers who'd like to pull material from them. Through all my issues I really did hit on some interesting topics and issues when it comes to Facebook!
Available Files
- IRB #08427 approval (.pdf)
- Interview Questions (evolution 2 out of 3) (.docx)
- Full Transcriptions (.zip, .pdf) only 10 of 26 available at this time
- A helpful guide containing topic- and question-related excerpts
Revisiting Facebook Surveys
I plan to run one last Facebook survey this upcoming fall (2008) with the new Soc 100 class (the professor has approved it, and the IRB should be okay with it too; it's a really simple survey). The sample should be something like 700 respondents, and the survey addresses two main issues: demographic aspects similar to the subject pool I pulled the interview participants from, and the opportunity for people without accounts to weigh in. The latter is also a drawback, however, as people sometimes don't get accounts until a few weeks into school.
Available Files
- None yet; files will be posted after the survey is run.
Miscellaneous
Other random data from Jeff. Only two fragments here for now:
Sampling
[2006 Grad data]
After Facebook implemented its new search system back in 2006, you could no longer manually enter query strings into the URL box, and results were limited to 500 profiles. To deal with this interface limitation I ran more specific searches on the UIUC network, including, for this little Excel fragment, gender intersected with political views and relationship status for all people who listed themselves as graduate students.
[Pseudo Digital Ethnography]
Content analysis (borderline autobiographical ethnography, with far too short an observation period) and simple statistics from a sideline project I did sometime in early 2008. I joined a rather outlandish Facebook group, "There Are Some Things Guys Should Always Do For Girls. Period.", and observed what was going on. It's not really what I'd call high-quality work, but it makes for an interesting mix of multi-method study and an introduction to a pretty sexist Facebook group community.
Limitations
[2006 Grad data]
Of course I've got no idea how things overlap or even what exact day this was taken; the file is date-stamped 05.05.06. And then there's the big issue: not everyone lists their academic status. Interesting, though: based on the count at the time, at least 1,630 graduate students had profiles on Facebook at UIUC, out of a total of roughly 9,000. That's a pretty small share compared to the 92% (or so) of undergraduates who had profiles at the time.
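For comparison's sake, the adoption share implied by these counts can be sketched as a back-of-the-envelope check (hypothetical; the 1,630 and 9,000 figures come from the text above):

```python
# Back-of-the-envelope share of UIUC graduate students with Facebook
# profiles, using the counts quoted above. This is a lower bound, since
# not everyone lists their academic status.
grad_profiles = 1630   # minimum graduate profiles counted (file dated 05.05.06)
grad_total = 9000      # approximate UIUC graduate population at the time

grad_share = grad_profiles / grad_total
print(f"Graduate adoption: at least {grad_share:.0%}")   # ~18%, vs ~92% of undergrads
```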
[Pseudo Digital Ethnography]
The two biggest limitations, I feel, were the length of the study and my own biases in regards to feminism. I couldn't help but read down the list and get angrier with each item. The group really seems to be structured around the contention of gender roles, and I often found myself sucked into the debate, feeling far from anything resembling scientific. Autobiography is becoming recognized as a legitimate method, but honestly, logging on and arguing (in notes or in actual discussion, it doesn't matter) doesn't really feel like real research to me. I tried to temper my negative reactions by clinging to content analysis and really basic statistical observations at first, in an attempt to avoid losing myself in the setting. It seems like most people who take on ethnography find the most unusual, exotic culture they can and jump in head first; here I was poking around a group, on a service I'd spent the last two years studying, that I happened to whole-heartedly take issue with. This implicitly raises the epistemological issue of conflict of interest. Regardless, the exercise did unearth some worthy findings, particularly those related to interface in general, that I might take with me into the future. I present this study, with full knowledge of its limited viability and lack of long-term analysis, as an opening exploration.
Available Files
- The UIUC Graduate Student spreadsheet (.xls)
- A collection of all of my fieldnotes, memos, artifact analysis and statistics from the Pseudo Digital Ethnography (compressed in a .rar archive; requires WinRAR to extract)