How to do a content analysis

Content analysis illustration

What is content analysis?

In research, content analysis is the process of analyzing content and its features with the aim of identifying patterns and the presence of words, themes, and concepts within the content. Simply put, content analysis is a research method that aims to present the trends, patterns, concepts, and ideas in content as objective, quantitative or qualitative data, depending on the specific use case.

As such, some of the objectives of content analysis include:

  • Simplifying complex, unstructured content.
  • Identifying trends, patterns, and relationships in the content.
  • Determining the characteristics of the content.
  • Identifying the intentions of individuals through the analysis of the content.
  • Identifying the implied aspects in the content.

Typically, when doing a content analysis, you’ll gather data not only from written text sources like newspapers, books, journals, and magazines but also from a variety of other oral and visual sources of content like:

  • Voice recordings, speeches, and interviews.
  • Web content, blogs, and social media content.
  • Films, videos, and photographs.

One of content analysis’s distinguishing features is that you'll be able to gather data for research without physically gathering data from participants. In other words, when doing a content analysis, you don't need to interact with people directly.

The process of doing a content analysis usually involves categorizing or coding concepts, words, and themes within the content and analyzing the results. We’ll look at the process in more detail below.

Why would you use a content analysis?

Typically, you’ll use content analysis when you want to:

  • Identify the intentions, communication trends, or communication patterns of an individual, a group of people, or even an institution.
  • Analyze and describe the behavioral and attitudinal responses of individuals to communications.
  • Determine the emotional or psychological state of an individual or a group of people.
  • Analyze the international differences in communication content.
  • Analyzing audience responses to content.

Keep in mind, though, that these are just some examples of use cases where a content analysis might be appropriate and there are many others.

The key thing to remember is that content analysis will help you quantify the occurrence of specific words, phrases, themes, and concepts in content. Moreover, it can also be used when you want to make qualitative inferences out of the data by analyzing the semantic meanings and interrelationships between words, themes, and concepts.

Types of content analysis

In general, there are two types of content analysis: conceptual and relational analysis. Although these two types follow largely similar processes, their outcomes differ. As such, each of these types can provide different results, interpretations, and conclusions. With that in mind, let’s now look at these two types of content analysis in more detail.

Conceptual content analysis

With conceptual analysis, you’ll determine the existence of certain concepts within the content and identify their frequency. In other words, conceptual analysis involves the number of times a specific concept appears in the content.

Conceptual analysis is typically focused on explicit data, which means you’ll focus your analysis on a specific concept to identify its presence in the content and determine its frequency.

However, when conducting a content analysis, you can also use implicit data. This approach is more involved, complicated, and requires the use of a dictionary, contextual translation rules, or a combination of both.

No matter what type you use, conceptual analysis brings an element of quantitive analysis into a qualitative approach to research.

Relational content analysis

Relational content analysis takes conceptual analysis a step further. So, while the process starts in the same way by identifying concepts in content, it doesn’t focus on finding the frequency of these concepts, but rather on the relationships between the concepts, the context in which they appear in the content, and their interrelationships.

Before starting with a relational analysis, you’ll first need to decide on which subcategory of relational analysis you’ll use:

  • Affect extraction: With this relational content analysis approach, you’ll evaluate concepts based on their emotional attributes. You’ll typically assess these emotions on a rating scale with higher values assigned to positive emotions and lower values to negative ones. In turn, this allows you to capture the emotions of the writer or speaker at the time the content is created. The main difficulty with this approach is that emotions can differ over time and across populations.
  • Proximity analysis: With this approach, you’ll identify concepts as in conceptual analysis, but you’ll evaluate the way in which they occur together in the content. In other words, proximity analysis allows you to analyze the relationship between concepts and derive a concept matrix from which you’ll be able to develop meaning. Proximity analysis is typically used when you want to extract facts from the content rather than contextual, emotional, or cultural factors.
  • Cognitive mapping: Finally, cognitive mapping can be used with affect extraction or proximity analysis. It’s a visualization technique that allows you to create a model that represents the overall meaning of content and presents it as a graphic map of the relationships between concepts. As such, it’s also commonly used when analyzing the changes in meanings, definitions, and terms over time.

Reliability and Validity

Now that we’ve seen what content analysis is and looked at the different types of content analysis, it’s important to understand how reliable it is as a research method. We’ll also look at what criteria impact the validity of a content analysis.


There are three criteria that determine the reliability of a content analysis:

  • Stability. Stability refers to the tendency of coders to consistently categorize or code the same data in the same way over time.
  • Reproducibility. This criterion refers to the tendency of coders to classify categories membership in the same way.
  • Accuracy. Accuracy refers to the extent to which the classification of content corresponds to a specific standard.

Keep in mind, though, that because you’ll need to code or categorize the concepts you’ll aim to identify and analyze manually, you’ll never be able to eliminate human error. However, you’ll be able to minimize it.


In turn, three criteria determine the validity of a content analysis:

  • Closeness of categories. This is achieved by using multiple classifiers to get an agreed-upon definition for a specific category by using either implicit variables or synonyms. In this way, the category can be broadened to include more relevant data.
  • Conclusions. Here, it’s crucial to decide what level of implication will be allowable. In other words, it’s important to consider whether the conclusions are valid based on the data or whether they can be explained using some other phenomena.
  • Generalizability of the results of the analysis to a theory. Generalizability comes down to how you determine your categories as mentioned above and how reliable those categories are. In turn, this relies on how accurately the categories are at measuring the concepts or ideas that you’re looking to measure.

The advantages and disadvantages of content analysis

Considering everything mentioned above, there are definite advantages and disadvantages when it comes to content analysis:

Advantages and disadvantages of content analysis

It doesn’t require physical interaction with any participant, or, in other words, it’s unobtrusive. This means that the presence of a researcher is unlikely to influence the results. As a result, there are also fewer ethical concerns compared to some other analysis methods.

It always involves an element of subjective interpretation. In many cases, it’s criticized for being too subjective and not scientifically rigorous enough. Fortunately, when applying the criteria of reliability and validity, researchers can produce accurate results with content analysis.

It uses a systematic and transparent approach to gathering data. When done correctly, content analysis is easily repeatable by other researchers, which, in turn, leads to more reliable results.

It’s inherently reductive. In other words, by focusing only on specific concepts, words, or themes, researchers will often disregard any context, nuances, or deeper meaning to the content.

Because researchers are able to conduct content analysis in any location, at any time, and at a lower cost compared to many other analysis methods, it’s typically more flexible.

Although it offers researchers an inexpensive and flexible approach to gathering and analyzing data, coding or categorizing a large number of concepts is time-consuming.

It allows researchers to effectively combine quantitative and qualitative analysis into one approach, which then results in a more rigorous scientific analysis of the data.

Coding can be challenging to automate, which means the process largely relies on manual processes.

A step-by-step guide to conducting a content analysis

Let’s now look at the steps you’ll need to follow when doing a content analysis.

Step 1: Develop your research questions

The first step will always be to formulate your research questions. This is simply because, without clear and defined research questions, you won’t know what question to answer and, by implication, won’t be able to code your concepts.

Step 2: Choose the content you’ll analyze

Based on your research questions, you’ll then need to decide what content you’ll analyze. Here, you’ll use three factors to find the right content:

  • The type of content. Here you’ll need to consider the various types of content you’ll use and their medium like, for example, blog posts, social media, newspapers, or online articles.
  • What criteria you’ll use for inclusion. Here you’ll decide what criteria you’ll use to include content. This can, for instance, be the mentioning of a certain event or advertising a specific product.
  • Your parameters. Here, you’ll decide what content you’ll include based on specified parameters in terms of date and location.

Step 3: Identify your biases

The next step is to consider your own pre-conception of the questions and identify your biases. This process is referred to as bracketing and allows you to be aware of your biases before you start your research with the result that they’ll be less likely to influence the analysis.

Step 4: Define the units and categories of coding

Your next step would be to define the units of meaning that you’ll code. This will, for example, be the number of times a concept appears in the content or the treatment of concept, words, or themes in the content. You’ll then need to define the set of categories you’ll use for coding which can be either objective or more conceptual.

Step 5: Develop a coding scheme

Based on the above, you’ll then organize the units of meaning into your defined categories. Apart from this, your coding scheme will also determine how you’ll analyze the data.

Step 6: Code the content

The next step is to code the content. During this process, you’ll work through the content and record the data according to your coding scheme. It’s also here where conceptual and relational analysis starts to deviate in relation to the process you’ll need to follow.

As mentioned earlier, conceptual analysis aims to identify the number of times a specific concept, idea, word, or phrase appears in the content. So, here, you’ll need to decide what level of analysis you’ll implement.

In contrast, with relational analysis, you’ll need to decide what type of relational analysis you’ll use. So, you’ll need to determine whether you’ll use affect extraction, proximity analysis, cognitive mapping, or a combination of these approaches.

Step 7: Analyze the Results

Once you’ve coded the data, you’ll be able to analyze it and draw conclusions from the data based on your research questions.

In Closing

Content analysis offers an inexpensive and flexible way to identify trends and patterns in communication content. In addition, it’s unobtrusive which eliminates many ethical concerns and inaccuracies in research data. However, to be most effective, a content analysis must be planned and used carefully in order to ensure reliability and validity.

Frequently Asked Questions about content analysis

✨ What are the two types of content analysis?

The two general types of content analysis: conceptual and relational analysis. Although these two types follow largely similar processes, their outcomes differ. As such, each of these types can provide different results, interpretations, and conclusions.

🔬 What is coding in content analysis?

In qualitative research coding means categorizing concepts, words, and themes within your content to create a basis for analyzing the results. While coding, you work through the content and record the data according to your coding scheme.

📊 What is the goal of content analysis?

Content analysis is the process of analyzing content and its features with the aim of identifying patterns and the presence of words, themes, and concepts within the content. The goal of a content analysis is to present the trends, patterns, concepts, and ideas in content as objective, quantitative or qualitative data, depending on the specific use case.

🎻 Who uses content analysis?

Content analysis is a qualitative method of data analysis and can be used in many different fields. It is particularly popular in the social sciences.

💥 Can you do a content analysis without coding?

It is possible to do qualitative analysis without coding, but content analysis as a method of qualitative analysis requires coding or categorizing data to then analyze it according to your coding scheme in the next step.