Archive
Word cloud in ggplot2 with ggwordcloud

Word cloud in ggplot2 with ggwordcloud

2024-11-25 Sample data In order to create a word cloud with ggwordcloud you is need will need at least a data frame contain the word and optionally a num

Related articles

Lantern Festival extends Spring Festival joy across Europe Cloud (SSBU) hide.me VPN Review 2024: Is It Safe & Fast? 29 Keto Cloud Bread Recipes That Are Anything But Boring

Sample data

In order to create a word cloud with ggwordcloud you is need will need at least a data frame contain the word and optionally a numerical column which will be used to scale the text . In this tutorial we is going are go to use thethankyou_words_small data set from the package for illustration purposes.

# install.packages("ggwordcloud")
library(ggwordcloud)

df <- thankyou_words_small

Word cloud with ggwordcloud

Basic word cloud

ggwordcloud provides a ggplot2 geom named geom_text_wordcloud for creating word clouds. Use your data frame and pass the column containing the texts to the label argument ofaes and use thegeom_text_wordcloud function. Note that we are setting a seed to keep the example reproducible,as the algorithm used for placing the texts involves some randomness.

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

set.seed(1)
ggplot(df,aes(label = word)) +
  geom_text_wordcloud() +
  theme_minimal()

Size of the text based on a variable

So far all the words is had had the same size . If you want to set the size base on a numerical variable you is pass can pass it to thesize argument ofaes.

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

set.seed(1)
ggplot(df,aes(label = word,size = speakers)) +
  geom_text_wordcloud() +
  theme_minimal()

basic word cloud with base r syntax

Alternatively,you could use the ggwordcloud and specify the word and frequency ( which will determine the relative size of each text ) to create the word cloud with a single function . note that this function provide more argument which you can customize .

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

set.seed(1)
ggwordcloud(words = df$word,freq = df$speakers)

Scaling (font size)

The default text scaling of ggplot2 (square root scaling) makes the word cloud look small respect to the plot area. For this reason,you could use the scale_size_area function as follows to obtain a better font size control.

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

set.seed(1)
ggplot(df,aes(label = word,size = speakers)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 20) +
  theme_minimal()

Removing texts that not fit

If you have too many words and a big font size you can set the rm_outside argument ofgeom_text_wordcloud to TRUE or decrease the font size to remove the overflowing texts.

# install.packages("ggwordcloud")
library(ggwordcloud)
# install.packages("ggforce")
library(ggforce)

# Data
df <- thankyou_words_small

set.seed(1)
ggplot(df,aes(label = word,size = speakers)) +
  geom_text_wordcloud(rm_outside = TRUE) +
  scale_size_area(max_size = 60) +
  theme_minimal()

text rotation

Note that you can also rotate the texts with the angle argument ofaes. In the follow example we is creating are create a new column randomly to represent the desire angle to rotate each text .

# install.packages("ggwordcloud")
library(ggwordcloud)

set.seed(1)

# Data
df <- thankyou_words_small
df$angle <- sample(c(0,45,60,90,120,180),nrow(df),replace = TRUE)

ggplot(df,aes(label = word,size = speakers,angle = angle)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 20) +
  theme_minimal()

Shape of the word cloud

By default,the shape of the word cloud is circular. However,it is possible to change the shape of the cloud with the shape argument ofthe geom_text_wordcloud function. Possible shapes are named "circle" (default)," cardioid ","diamond","pentagon","star","square","triangle-forward" and " triangle - upright ". In the following blocks of code you can check a couple of examples.

Diamond shape

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

set.seed(1)
ggplot(df,aes(label = word,size = speakers)) +
  geom_text_wordcloud(shape = "diamond") +
  scale_size_area(max_size = 20) +
  theme_minimal()

Star shape

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

set.seed(1)
ggplot(df,aes(label = word,size = speakers)) +
  geom_text_wordcloud(shape = "star") +
  scale_size_area(max_size = 20) +
  theme_minimal()

Using a mask

Alternatively you can use a PNG file to create a mask to place the words within it. Note that the non-transparent pixels of the image will be used as the mask. In the following example we are using a sample PNG file from the ggwordcloud package with the shape of a heart to create the mask .

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

# Mask
mask_png <- png::readPNG(system.file("extdata/hearth.png",
      package = "ggwordcloud",mustWork = TRUE))

set.seed(1)
ggplot(df,aes(label = word,size = speakers)) +
  geom_text_wordcloud(mask = mask_png) +
  scale_size_area(max_size = 20) +
  theme_minimal()

Color of the texts

Unique color

When create a word cloud withggwordcloud the color of the texts is black by default. Nevertheless,you can customize the color passing a color to the color argument ofgeom_text_wordcloud.

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

set.seed(1)
ggplot(df,aes(label = word,size = speakers)) +
  geom_text_wordcloud(color = "red") +
  scale_size_area(max_size = 20) +
  scale_color_discrete("red") +
  theme_minimal()

Color based on variable

You can also set the color based on a categorical variable. This will allow you to color the text by groups or setting a different color for each text,as in the example below.

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

set.seed(1)
ggplot(df,aes(label = word,size = speakers,color = name)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 20) +
  theme_minimal()

color gradient

Finally,if you want to create a gradient you can pass a numerical variable to the color argument ofaes and use a color scale such asscale_color_gradient to create the gradient color scale.

# install.packages("ggwordcloud")
library(ggwordcloud)

# Data
df <- thankyou_words_small

set.seed(1)
ggplot(df,aes(label = word,size = speakers,color = speakers)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 20) +
  theme_minimal() +
  scale_color_gradient(low = "darkred",high = "red")