Content Coding

In organizational surveys, also often referred to as employee attitude surveys, data are gathered in two general forms, quantitative and qualitative. Quantitative approaches typically involve a statement (e.g., “Processes and procedures allow me to effectively meet my customers’ needs”) followed by a scale of response options (e.g., “strongly agree…strongly disagree”). This can be called a quantitative approach to measuring attitudes because the resulting data will be in the form of numbers.

By contrast, a qualitative approach allows free-form text to be entered by the person taking the survey. These are often referred to as open-ended questions or write-in questions, a term born in the days when surveys were typically conducted using paper and pencil and employees were given an opportunity to provide comments in their own handwriting. Today, many, if not most, surveys are administered using a computer, and comments are actually typed in.

The following are some typical examples of write-in questions:

  • Please provide any comments you have on what it feels like to work here at Company X.
  • Please provide any suggestions you have about how Company X could better enable you to balance your work and personal/family life.
  • Do you have any additional comments or suggestions?

Content Coding Uses

The clarification of issues that are on the minds of employees, in their own words, is a powerful benefit of gathering comments in a survey. Although quantitative data can provide precision, trending, and easy comparisons to benchmarks, write-in comments add richness by bringing abstract issues to life through specific examples, new ideas, and suggestions for improvement.

Gathering comments also allows the researcher to gain insight into a new issue, one in which the important elements are only partially known. For example, a survey may contain a question about the most important characteristics of an employee benefits program. One might have some good hunches about benefits such as flexibility, health coverage, and dependent care. However, an open-ended question asking employees to describe what they would like to see in “the perfect benefits package” would provide a much more complete list of issues. Follow-up questions in later surveys, with a more tested list of issues, could then be used in a quantitative approach to monitor the effectiveness of the benefits programs.

Content Coding Techniques

Content coding of open-ended data is a process of reading through a set (or subset) of comments and describing themes that tie together many individual comments. These themes are loosely analogous to the factors extracted in a factor analysis: each one summarizes what many individual comments have in common. Common themes in open-ended data depend on the questions being asked, but they will often include elements that are common to a workplace climate survey: compensation, management, jobs, workload, work-life balance, and business performance.

Some researchers ask the employees who are taking the survey to help code their comments while they are taking the survey. For example, a write-in question could be preceded by a list of topics that employees choose from to describe the theme of their comments. It is also possible to ask specifically whether the comment is generally positive or negative. This can help provide an initial structure to work with in developing themes.
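
As a rough illustration of how respondent-supplied codes can give an initial structure, the sketch below cross-tabulates the topic and tone that employees selected alongside their comments. The record layout, the field names (topic, tone, text), and the sample responses are all hypothetical and are not drawn from any particular survey platform.

```python
from collections import Counter

# Hypothetical survey records: each respondent picked a topic and a tone
# before typing a comment. Field names are assumptions for illustration.
responses = [
    {"topic": "workload", "tone": "negative", "text": "Too many late nights."},
    {"topic": "workload", "tone": "negative", "text": "We need more staff."},
    {"topic": "management", "tone": "positive", "text": "My manager listens."},
    {"topic": "compensation", "tone": "negative", "text": "Raises lag the market."},
    {"topic": "workload", "tone": "positive", "text": "The new schedule helps."},
]


def tally_self_codes(records):
    """Count comments for each (topic, tone) pair chosen by respondents."""
    return Counter((r["topic"], r["tone"]) for r in records)


table = tally_self_codes(responses)
for (topic, tone), count in sorted(table.items()):
    print(f"{topic:<14}{tone:<10}{count}")
```

A table like this only shows where comments cluster; the comments themselves still have to be read to develop themes.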

One approach to developing themes is to keep a tally of issues mentioned as comments are read. Because of the ambiguity of the meanings of words and phrases, it is important to take a systematic approach to developing categories and coding comments. One simple approach is to have two raters set up initial categories based on their reading of the same portion of an open-ended data set. Through discussion of any differences in the taxonomies, a more stable and useful set of categories and coding rules can be derived for use in coding the rest of the comments. Reliability can be assessed through agreement among different raters on assigning themes to comments. If agreement between raters is low (e.g., less than 90%), then the researchers should review and refine their category list and coding rules and recode the comments until adequate reliability is achieved.
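
The agreement check described above can be sketched as follows. This is a minimal example that assumes two raters have each assigned exactly one theme to the same ten comments; the theme labels and ratings are invented. The 90% threshold comes from the example in the text. Cohen's kappa, a chance-corrected statistic, is not mentioned in the text and is included only as a common companion to simple percent agreement.

```python
from collections import Counter

AGREEMENT_THRESHOLD = 0.90  # example cutoff taken from the text


def percent_agreement(codes_a, codes_b):
    """Share of comments to which both raters assigned the same theme."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both raters must code the same non-empty set of comments.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[theme] * freq_b[theme] for theme in freq_a) / (n * n)
    if expected == 1.0:  # both raters used a single identical theme throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented ratings for ten comments; the raters disagree on the fifth one.
rater_1 = ["pay", "workload", "management", "pay", "workload",
           "pay", "management", "pay", "workload", "pay"]
rater_2 = ["pay", "workload", "management", "pay", "management",
           "pay", "management", "pay", "workload", "pay"]

agreement = percent_agreement(rater_1, rater_2)
kappa = cohens_kappa(rater_1, rater_2)
needs_recoding = agreement < AGREEMENT_THRESHOLD

print(f"agreement = {agreement:.2f}")  # agreement = 0.90
print(f"kappa = {kappa:.2f}")          # kappa = 0.84
print("refine categories and recode" if needs_recoding else "reliability adequate")
```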

Because of the large size of some qualitative data sets, the task of reading through open-ended comments can be overwhelming. One approach to handling the volume is to sample a portion of the comments. If there are 2,000 comments, for example, reading through 400 of them (or every fifth comment) will usually give a good feel for the entire set of comments. This is similar to the notion of sampling as it is applied to quantitative results. Although the concept of a margin of sampling error makes little sense when applied to open-ended comments, redundancy will begin to appear after a couple of hundred comments. By the time a reader reaches 400, few completely new suggestions or ideas will emerge.
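
Drawing every fifth comment is a form of systematic sampling, which might be sketched as below. The function name, the random starting offset, and the placeholder comments are assumptions made for illustration; the text specifies only the sampling interval.

```python
import random


def systematic_sample(comments, step=5, seed=None):
    """Return every `step`-th comment, beginning at a random offset below `step`."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    rng = random.Random(seed)
    start = rng.randrange(step)
    return comments[start::step]


# Placeholder data standing in for 2,000 write-in comments.
comments = [f"comment {i}" for i in range(1, 2001)]
sample = systematic_sample(comments, step=5, seed=42)

print(len(sample))  # 400
```

A random starting offset is used so that the first comment in the file is not always selected. If the comments are stored in a meaningful order (for example, by department), a simple random sample may be a safer choice.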

Computer technology continues to evolve, and it offers some intriguing new techniques for deriving themes and understanding open-ended data. Researchers can use content-coding software tools, also called text-mining tools, to uncover themes within their qualitative data set. At a minimum, this provides an initial structure to use when reading through comments.

Content-coding tools require researchers to define certain terms and phrases, which the computer then uses to evaluate similarity across comments. For example, the terms CEO, president, chairperson, and even the proper name of a person, such as Jane Doe, may all have the same meaning in a given organization. Similarly, the terms compensation, pay, benefits, and money may be defined as belonging to the same category. With patience and an iterative technique for defining the right terms and phrases in a given survey project, text-mining software can help researchers define themes in an open-ended qualitative database.
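
The dictionary idea described above can be illustrated with a small, hand-rolled matcher. This is a sketch of the general approach, not the behavior of any particular text-mining product. The theme names and the dictionary structure are hypothetical; the terms themselves are the examples given in the text.

```python
import re

# Hypothetical coding dictionary: theme -> terms treated as equivalent.
CODING_DICTIONARY = {
    "senior leadership": ["ceo", "president", "chairperson", "jane doe"],
    "compensation": ["compensation", "pay", "benefits", "money"],
}


def compile_patterns(dictionary):
    """Build one case-insensitive, whole-word pattern per theme."""
    patterns = {}
    for theme, terms in dictionary.items():
        alternatives = "|".join(re.escape(term) for term in terms)
        patterns[theme] = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return patterns


def code_comment(comment, patterns):
    """Return the sorted list of themes whose terms appear in the comment."""
    return sorted(theme for theme, pattern in patterns.items() if pattern.search(comment))


PATTERNS = compile_patterns(CODING_DICTIONARY)

print(code_comment("Jane Doe should review our pay.", PATTERNS))
print(code_comment("I enjoy working with my team.", PATTERNS))
```

Comments that match no theme, like the second one here, are useful candidates for review when the dictionary is refined in the next iteration.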

This iterative technique also becomes the process whereby the researcher learns about and defines the themes. The results should be reviewed and the dictionary refined multiple times to produce the most useful data set. Elements that prove tricky to define include evaluations of goodness. For example, a comment that reads “not very good” needs to be defined as a negative comment. Sarcasm and irony can prove especially difficult for a computer to classify properly.
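
The negation problem can be illustrated with a deliberately simple rule: an evaluative word is flipped when a negator appears shortly before it. The word lists, the three-word window, and the function name are all assumptions made for this sketch. Real tools use much richer dictionaries and rules.

```python
import re

# Tiny hypothetical word lists; a real project would refine these iteratively.
POSITIVE = {"good", "great", "excellent", "helpful"}
NEGATIVE = {"bad", "poor", "terrible", "unfair"}
NEGATORS = {"not", "no", "never"}
NEGATION_WINDOW = 3  # a negator affects the next evaluative word within 3 tokens


def polarity(comment):
    """Classify a comment as positive, negative, or neutral using word lists."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    score = 0
    window = 0  # how many more tokens the most recent negator can reach
    for token in tokens:
        if token in NEGATORS:
            window = NEGATION_WINDOW
            continue
        value = 1 if token in POSITIVE else -1 if token in NEGATIVE else 0
        if value:
            score += -value if window > 0 else value
            window = 0
        elif window > 0:
            window -= 1
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


print(polarity("The training was not very good."))      # negative
print(polarity("The training was good."))               # positive
print(polarity("Oh great, another reorganization."))    # positive
```

The last call shows the limitation noted in the text: a sarcastic remark is scored as positive because the rule sees only the word “great.”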

Content-coding software produces a kind of map representing the comments. Often this map will be structured visually, with large circles representing themes that contain many comments and smaller circles representing themes that are less commonly mentioned. These thematic circles can be connected by lines indicating their similarity or redundancy in content with other themes. The researcher now has a starting point for reading through the comments.
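
The data behind such a map can be approximated with two counts: how many comments mention each theme (circle size) and how many mention each pair of themes together (line strength). The sketch below assumes the comments have already been coded. The function names, the sample codes, and the use of the Jaccard index as the similarity measure are illustrative choices, not a description of how any specific software draws its map.

```python
from collections import Counter
from itertools import combinations


def build_theme_map(coded_comments):
    """Return (sizes, links): comments per theme and comments per theme pair."""
    sizes = Counter()
    links = Counter()
    for themes in coded_comments:
        unique = sorted(set(themes))
        sizes.update(unique)
        links.update(combinations(unique, 2))
    return sizes, links


def overlap(theme_a, theme_b, sizes, links):
    """Jaccard index: shared comments divided by comments mentioning either theme."""
    shared = links[tuple(sorted((theme_a, theme_b)))]
    union = sizes[theme_a] + sizes[theme_b] - shared
    return shared / union if union else 0.0


# Invented codes for five comments; each comment may carry several themes.
coded = [
    {"pay", "workload"},
    {"pay"},
    {"workload", "management"},
    {"pay", "workload"},
    {"management"},
]

sizes, links = build_theme_map(coded)
print(sizes.most_common())                        # larger count -> larger circle
print(overlap("pay", "workload", sizes, links))   # 0.5
```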

Content Coding Cautions and Recommendations

Exclusive reliance on open-ended comments is not recommended. Although write-in comments provide a richness of perspective that is not available with quantitative data, research has found that comments provide a somewhat more negative view of the state of a workplace climate than quantitative results. This is because people who are dissatisfied tend to use write-in comment opportunities more often than satisfied employees. Similarly, those who are dissatisfied with an issue will often provide more lengthy comments than those who are satisfied. Therefore, it is important to look to both quantitative and qualitative data, in addition to the surrounding context, to paint an accurate representation of a workplace climate.

Some researchers argue that open-ended qualitative data usually overlap and therefore are redundant with quantitative results. This calls into question whether it is worth the effort to include open-ended comment opportunities in an employee survey.

In the end, employees want to speak their minds about the workplace. Employers can gain insight into the workplace climate by using open-ended data collection. Sampling and content coding tools can help researchers and organizational leaders gain the rich insights provided by open-ended comments without being overwhelmed by them. Another simple method of keeping write-in comments at a manageable level is to restrict the length allowed (for example, to 10 or 20 lines of text).

Regardless of methods used and the questions asked, it is critical for the integrity of the survey process to be clear about whether the survey is anonymous, who will see the results, and how the results will be used. With regard to open-ended comments, it should be clear who will see the comments and whether they will be edited (e.g., to remove proper names or foul language). Transparency about the purpose and methods used in a survey project or program will go a long way toward protecting the validity and utility of employee attitude measurements.
