closedistances (closedistances) wrote in cmanthropology,

Virtual Ethnographer’s Toolkit: a software fantasy

As researchers of cyberspace, virtual ethnographers approach social software with different needs than the typical user. While the typical user may only seek out conversational technologies for communicative purposes, the virtual ethnographer tends to ask questions about the patterns of online communication. As in traditional ethnography, there is no substitute for “being there” doing participant observation. After all, virtual ethnography “is not so much a method in itself, but is often a way of applying in a new context…various [other] methods” (Bird & Barber 2002: 130). But unlike traditional ethnography, the raw data already exists in digital format, and should be easier to analyze with the proper software. However, as someone just beginning dissertation research that will include a virtual ethnography component, I have found myself wishing on more than one occasion that I had software capable of automating certain tasks. With this in mind, I want to use this entry to imagine a software package, which I will call “The Virtual Ethnographer’s Toolkit” (VET for short), that would be able to perform the tasks that existing programs do not seem able to do. (If anyone either knows of existing software packages that can perform these tasks or is interested in creating such a program, I invite you to respond to this entry.)

One of the problems I find with some of the existing software that would be of use to virtual ethnographers is that it is specific to one particular conversational technology. I am thinking specifically here of Netscan, which is very useful for analyzing Usenet, but cannot be used for any other conversational technology. I even e-mailed Microsoft to ask if there were any plans to expand Netscan’s capabilities to blogs and message boards, or if they could direct me towards any other software with this capability. Unfortunately, the answer was no in both cases. I also ran across the webpage of an HCI professor who mentioned plans to develop a program based on Netscan that looked to be exactly what I was looking for. However, when I e-mailed him about it, he told me that the program never materialized. To overcome these sorts of problems, VET would be able to apply its analyses of virtual communities to multiple conversational technologies, and would be able to import modules that would allow it to analyze new conversational technologies as they develop. Common conversational technologies that exist now include:
Static and database-backed web pages
Discussion forum
Internet chat/instant messaging
Video and audio streaming
Video and audio conferencing
Weblog (“blog”)
RSS (Green & Pearson 2005: 2)

Netscan allows automatic graphs to be generated for the number of posters, returnees, posts, and so forth. I think being able to do the same thing for e-mail listservs, discussion forums, and blogs would be quite useful to virtual ethnographers interested in questions of whether the group has a stable membership and whether participation is increasing or decreasing.
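To make the idea of participation counts concrete, here is a minimal Python sketch of the tallying such a feature might do before graphing. Everything in it is an assumption for illustration: the sample posts, the function name `participation_by_month`, and especially the definition of a "returnee" as anyone who also posted in any earlier month, which may not match how Netscan defines the term.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample data: (author, posting date) pairs already collected
# from a forum, listserv, or blog.
posts = [
    ("alice", date(2005, 9, 3)),
    ("bob", date(2005, 9, 17)),
    ("alice", date(2005, 10, 1)),
    ("carol", date(2005, 10, 9)),
    ("alice", date(2005, 10, 20)),
]


def participation_by_month(posts):
    """Count posts, distinct posters, and returnees for each (year, month)."""
    authors_by_month = defaultdict(set)
    post_counts = defaultdict(int)
    for author, day in posts:
        key = (day.year, day.month)
        authors_by_month[key].add(author)
        post_counts[key] += 1

    summary = {}
    seen_before = set()  # everyone who posted in an earlier month
    for key in sorted(authors_by_month):
        authors = authors_by_month[key]
        summary[key] = {
            "posts": post_counts[key],
            "posters": len(authors),
            # Assumption: a returnee is a poster seen in any earlier month.
            "returnees": len(authors & seen_before),
        }
        seen_before |= authors
    return summary


for month, counts in participation_by_month(posts).items():
    print(month, counts)
```

Falling poster counts alongside a steady number of returnees would suggest a stable core membership with fewer newcomers, which is the kind of question these graphs are meant to answer.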

The second feature I envision VET having is being able to automatically extract tabular data from social software. Many different types of social software allow users to create profiles of some sort, and some feature community profiles of some sort (e.g. how many visits and posts were made that day). VET would be able to automatically collect this data and export it into a format of your choice so that you could use an appropriate program (Excel, SPSS, etc.) to statistically analyze it or create charts based on it. If that data happens to include geographic information, VET would also be able to geocode it so it could be imported directly into a GIS program.
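The export step could be as simple as the following sketch, which writes already-collected profile records to CSV, a format both Excel and SPSS can open. The profile fields, the sample records, and the function name `profiles_to_csv` are hypothetical; the hard part, actually collecting the profiles from each kind of social software, is left out.

```python
import csv
import io

# Hypothetical profile records; real profiles will have uneven fields.
profiles = [
    {"username": "alice", "joined": "2004-02-11", "posts": 120,
     "location": "Tampa, FL"},
    {"username": "bob", "joined": "2005-06-30", "posts": 8},
]


def profiles_to_csv(profiles, fieldnames):
    """Return the profiles as CSV text, leaving missing fields blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        restval="",             # value used when a profile lacks a field
        extrasaction="ignore",  # skip fields not listed in fieldnames
    )
    writer.writeheader()
    writer.writerows(profiles)
    return buffer.getvalue()


print(profiles_to_csv(profiles, ["username", "joined", "posts", "location"]))
```

Leaving missing fields blank rather than dropping the record matters here, since users fill in their profiles very unevenly.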

Third, VET would be able to function like a “meta-search engine” for the various types of social software, allowing the virtual ethnographer to get a variety of background information on a particular virtual community without having to know about and visit all the websites that can provide such information. If the virtual ethnographer wanted to analyze a website, for instance, VET could search for the specified website on Alexa, the Internet Archive’s Wayback Machine, Textalyzer, and Whois domain explorer, and then generate a thorough background report on the site based on the information it gathered.

Fourth, for synchronous conversational technologies, VET could function like a bot. This would allow the researcher to keep archives of conversations, and the bot could be programmed to conduct interviews, thereby saving the virtual ethnographer time and allowing participants who keep different hours to be reached. These interviews could either be a straightforward list of questions asked in order, or the bot could be programmed to ask follow-up questions based on the responses, similar to ELIZA. While programs like ELIZA tend to be “dumb” and cannot pass for an actual person, they have the potential to provide “a reassuring encounter with an almost-other” (Turkle 1995: 109), thereby encouraging participants to be more talkative. Unlike ELIZA, however, the VET bot would limit the number of follow-up questions. The virtual ethnographer would, of course, have to be familiar with the cultural context of the virtual community before deciding to use this feature, since reactions to ELIZA vary (Turkle 1995: 105-109).
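A rough sketch of the interview logic might look like this. It is not ELIZA and not a working chat bot: the questions, the keyword-triggered follow-ups, and the `run_interview` function are all invented for illustration, and connecting it to an actual chat service is left out. The `ask` argument stands in for whatever sends a question and waits for the participant's reply.

```python
# Hypothetical interview script.
QUESTIONS = [
    "How did you first find this community?",
    "What keeps you coming back?",
]

# Hypothetical keyword triggers for follow-up prompts.
FOLLOW_UPS = {
    "friend": "Tell me more about the friends you have made here.",
    "bored": "What do you do here when you are bored?",
}


def run_interview(ask, questions=QUESTIONS, follow_ups=FOLLOW_UPS,
                  max_follow_ups=1):
    """Ask each question in order, with a capped number of follow-ups.

    ask(question) must return the participant's answer as a string.
    Returns the transcript as a list of (question, answer) pairs.
    """
    transcript = []
    for question in questions:
        answer = ask(question)
        transcript.append((question, answer))
        asked = 0
        for keyword, prompt in follow_ups.items():
            if asked >= max_follow_ups:
                break  # the cap that keeps the bot from probing endlessly
            if keyword in answer.lower():
                transcript.append((prompt, ask(prompt)))
                asked += 1
    return transcript


# Canned answers stand in for a live participant.
canned = iter([
    "A friend sent me a link when I was bored.",
    "We met on another forum.",
    "The regulars.",
])
for question, answer in run_interview(lambda q: next(canned)):
    print("Q:", question)
    print("A:", answer)
```

Passing `input` as the `ask` argument would let you try the script at a console. The returned transcript doubles as the archive of the conversation.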

Fifth, VET would be able to automatically generate texts to analyze based on certain criteria. There are (at least) three ways that text analysis may be done: (1) content analysis, where repeated observations of themes or content within a body of text lead to the development of analytic categories (LeCompte & Schensul 1999: 129; McCarty 2005: sec. II-A-2-I-A), (2) concordance analysis, which is the systematic transformation of textual data to “direct your attention to the immediate linguistic environment of the specified word” (McCarty 2005: sec. II-A-2-III-C), and (3) statistical analysis, which “involves counting particular features of the textual data and then applying one or more mathematical transformations” (McCarty 2005: sec. II-A-2-I-A). VET would be able to aid all three of these, not only by compiling all the communication within a virtual community into a single body of text, but also by recognizing and separating particular users and thematic categories to generate a body of text for analysis. For example, if I wanted to do a textual analysis of a particular blog, VET would be able to generate a text file consisting only of posts within the “Technology” category, only posts containing the phrase “virtual ethnography” within them, or only posts by Rex. These texts could then be exported into programs like Concordance or Weft QDA.
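As a sketch of how the selection and a simple concordance could work, assume the posts have already been collected as records with an author, a category, and a text. The sample posts, the author "Kim", and the function names below are hypothetical, and the keyword-in-context routine is only a bare-bones stand-in for what a program like Concordance does.

```python
import re

# Hypothetical posts; only the "Technology" category and the author "Rex"
# come from the example in the text above.
posts = [
    {"author": "Rex", "category": "Technology",
     "text": "Virtual ethnography needs better tools."},
    {"author": "Kim", "category": "Fieldwork",
     "text": "Notes on doing virtual ethnography in chat rooms."},
    {"author": "Rex", "category": "Fieldwork",
     "text": "Back from the field."},
]


def select_posts(posts, author=None, category=None, phrase=None):
    """Keep only the posts that match every criterion that was given."""
    selected = []
    for post in posts:
        if author is not None and post["author"] != author:
            continue
        if category is not None and post["category"] != category:
            continue
        if phrase is not None and phrase.lower() not in post["text"].lower():
            continue
        selected.append(post)
    return selected


def build_corpus(posts):
    """Join the selected posts into a single body of text."""
    return "\n\n".join(post["text"] for post in posts)


def concordance(text, word, width=3):
    """List each occurrence of word with `width` words on either side."""
    tokens = re.findall(r"\w+", text.lower())
    lines = []
    for i, token in enumerate(tokens):
        if token == word.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


corpus = build_corpus(select_posts(posts, phrase="virtual ethnography"))
for left, word, right in concordance(corpus, "ethnography", width=2):
    print(f"{left:>20} [{word}] {right}")
```

The corpus string returned by `build_corpus` is what would be saved to a text file for import into a separate analysis program.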

Sixth, because it is important for “critical cyberstudies” to look at discourses about online discourses (Silver 2000), VET would allow the virtual ethnographer to search for what is being said elsewhere about a virtual community. VET could see what is being said about a particular virtual community on television and radio and in newspapers, magazines, books, message boards, and blogs by searching Lexis-Nexis, Technorati, Google Print, Board Reader, Blinkx, and Google News. A virtual ethnographer interested in a particular LiveJournal community would probably not find anything about that specific virtual community, but could probably find discourses on LiveJournal and blogs in general to help contextualize his or her research.

Seventh, VET would be able to aid in analyzing the social networks of virtual communities, having the capability to generate a sociogram, which is “a visual graph of the network that will help you clarify its characteristics” (Wolfe & Hagen 2002: 148). VET could also generate data files for programs like UCINET for more complex analyses. VET could automatically identify network connections in a variety of ways. One way is to keep a running tally of which individuals reply to each other. Replies should be easily identifiable in conversational technologies that represent replies through a tree-like structure (like LiveJournal, Google Groups, and certain styles of EZboard), although they may also be identified through subject titles that begin with “re:” and by the quoting of another user’s text. When VET encounters posts with “re:” titles and/or quotes, it will automatically search the other messages within that thread to see if the post can be connected to a single message. Discovering social networks within chat rooms would be more difficult since there is no tree-like structure and no subject titles; however, users are likely to quote each other’s names and text in order to make it clear who they are addressing. A remaining problem is that users can communicate with each other without any of these easily identifiable indicators for VET to pick up on, relying simply on the context of what is being said for the receiver to know they are the intended recipient. Should these gaps be of concern, the virtual ethnographer could manually add these communicative acts to VET’s tally of who is conversing with whom, and also “contact each individual in the network…[and p]rovide each member with a copy of your list and ask each of them to indicate every individual that they have regular communication with” (Wolfe & Hagen 2002: 146).
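For the tree-structured case, the running tally could be sketched as follows. The message records and the function name `tally_replies` are hypothetical, and the sketch assumes each message already carries the id of the message it replies to. Detecting replies from “re:” titles, quoted text, or names mentioned in chat is not attempted here.

```python
from collections import Counter

# Hypothetical thread: each message records its author and the id of the
# message it replies to (None for the post that starts the thread).
messages = [
    {"id": 1, "author": "alice", "parent": None},
    {"id": 2, "author": "bob", "parent": 1},
    {"id": 3, "author": "alice", "parent": 2},
    {"id": 4, "author": "carol", "parent": 1},
    {"id": 5, "author": "bob", "parent": 3},
]


def tally_replies(messages):
    """Count how often each author replies to each other author."""
    author_of = {m["id"]: m["author"] for m in messages}
    tally = Counter()
    for m in messages:
        parent_author = author_of.get(m["parent"])
        # Skip thread starters, unknown parents, and replies to oneself.
        if parent_author is None or parent_author == m["author"]:
            continue
        tally[(m["author"], parent_author)] += 1
    return tally


# Each (replier, recipient) pair is a directed tie weighted by reply count.
for (replier, recipient), count in sorted(tally_replies(messages).items()):
    print(f"{replier} -> {recipient}: {count}")
```

The resulting weighted, directed pairs are the raw material for a sociogram, and ties the researcher identifies by hand could be added to the same tally.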

Unfortunately, the Virtual Ethnographer’s Toolkit exists only in my imagination (and perhaps yours as well now). However, in imagining VET with these features, I tried to make it plausible that something like this could be developed, given what I know of software that already exists. Even if VET were to be created with all the features I listed, it would still have trouble handling data that is not text-based such as podcasts and photo blogs. Also, VET would merely be a tool for virtual ethnographers, not a replacement for them. A great software package alone cannot guarantee insightful analysis. However, with better tools, virtual ethnographers may be aided in making insightful analyses of the virtual communities they study.

If you have ever done a virtual ethnography, I invite you to participate in this fantasy and add whatever features you think VET should have that I did not already think of.

References Cited

Bird, S. Elizabeth, and J. Barber
2002 Constructing a virtual ethnography. In Doing Cultural Anthropology: Projects for Ethnographic Data Collection. M. Angrosino, ed. Pp. 129-139. Prospect Heights, IL: Waveland.

Green, David T., and John M. Pearson
2005 Social Software and Cyber Networks: Ties That Bind or Weak Associations within the Political Organization? Proceedings of the 38th Hawaii International Conference on System Sciences.

LeCompte, Margaret D., and Jean J. Schensul
1999 Designing & Conducting Ethnographic Research. 7 vols. Volume 1. Walnut Creek & Lanham & New York & Oxford: Rowman & Littlefield Publishers.

McCarty, Willard
2005 Sources, topics and exercises for YEAR 1 2005-2006 Fundamentals of the digital humanities: Index to readings and reference materials by topic. Electronic document, accessed October 16, 2005.

Silver, David
2000 Looking Backwards, Looking Forward: Cyberculture Studies 1990-2000. Originally published in Web.studies: Rewiring Media Studies for the Digital Age, edited by David Gauntlett (Oxford University Press, 2000): 19-30. Resource Center For Cyberculture Studies. Electronic document, accessed October 10, 2005.

Turkle, Sherry
1995 Life on the Screen: Identity in the Age of the Internet. New York: Touchstone.

Wolfe, Alvin W., and Guy Hagen
2002 Developing an Electronic Ethnography. In Doing Cultural Anthropology: Projects for Ethnographic Data Collection. M. Angrosino, ed. Pp. 139-149. Prospect Heights, IL: Waveland.