library(data.table)
library(reactable)
# read the mock diversity table and show it as an interactive table
df <- fread("../inst/data/diversity1.txt")
reactable(
  df,
  theme = reactableTheme(backgroundColor = "transparent")
)
Diversity estimation is a fundamental analysis in immunoinformatics. It quantifies how diverse a population of immune cells, such as T cells or B cells, is.
The repDiversity function offers several approaches for estimating repertoire diversity. Its method parameter selects which approach is used.
The table shows an example of the data format needed for diversity estimation visualizations. It contains mock data generated specifically for this purpose.
The “Sample” column holds a unique identifier for each sample, and the “Group” column gives the group each sample belongs to. The “Patient” column identifies the patient associated with each sample. The “Shannon” column holds the Shannon index used for the diversity analysis.
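To make the “Shannon” column more concrete, here is a minimal sketch of how a Shannon index could be computed from clonotype counts and assembled into a table with the same four columns. The helper `shannon_index()`, the counts, and the group and patient labels are all hypothetical, and the natural logarithm is an assumption, since the log base behind the mock data is not stated.

```r
# Shannon index: H = -sum(p * log(p)) over clonotype proportions p.
# Assumption: natural logarithm; the log base used for the mock data is unknown.
shannon_index <- function(counts) {
  counts <- counts[counts > 0]   # zero counts contribute nothing to the sum
  p <- counts / sum(counts)      # convert counts to proportions
  -sum(p * log(p))
}

# Hypothetical clonotype counts for three samples (illustrative values only)
clone_counts <- list(
  S1 = c(10, 10, 10, 10),  # perfectly even repertoire: H = log(4)
  S2 = c(37, 1, 1, 1),     # dominated by one clone: low H
  S3 = c(40)               # a single clone: H = 0
)

# Assemble a table with the same columns as the example data
mock <- data.frame(
  Sample  = names(clone_counts),
  Group   = c("A", "A", "B"),     # hypothetical group labels
  Patient = c("P1", "P2", "P3"),  # hypothetical patient labels
  Shannon = vapply(clone_counts, shannon_index, numeric(1))
)
print(mock)
```

An even repertoire gives the highest value for a given number of clonotypes, while a repertoire dominated by one clone gives a value close to zero.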
The boxplot is one of the most commonly used chart types for comparing the distribution of a numeric variable across multiple groups.
In a boxplot, the central line within the box marks the median, which splits the data into two equal halves. The edges of the box mark the lower and upper quartiles, and the whiskers extend to the lowest and highest values that are not classed as outliers (by default in geom_boxplot(), values within 1.5 times the interquartile range of the box edges).
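The pieces of a boxplot can be checked by hand. The sketch below computes them for one numeric vector using quantile() and the 1.5 × IQR rule, which is assumed here to mirror the ggplot2 default. The helper name `box_summary()` and the input values are hypothetical.

```r
# Hypothetical helper: the summary values behind a boxplot for one group.
# Assumption: quartiles from quantile() and whiskers limited to 1.5 * IQR.
box_summary <- function(x, coef = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  iqr <- q[3] - q[1]
  lower_fence <- q[1] - coef * iqr
  upper_fence <- q[3] + coef * iqr
  inside <- x[x >= lower_fence & x <= upper_fence]
  list(
    lower_whisker = min(inside),  # lowest value that is not an outlier
    q1 = q[1],
    median = q[2],
    q3 = q[3],
    upper_whisker = max(inside),  # highest value that is not an outlier
    outliers = x[x < lower_fence | x > upper_fence]
  )
}

# Illustrative values with one obvious outlier
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)
res <- box_summary(x)
str(res)
```

Here the value 50 falls outside the upper fence, so the upper whisker stops at 9 and 50 would be drawn as a separate outlier point.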
Boxplots are created using the geom_boxplot() function from the ggplot2 package.
Boxplots summarize the distribution of the data in each group, but they can hide the underlying details of that distribution.
A common remedy is to draw the individual data points behind the boxplot with geom_point(), which gives a clearer view of how the data are distributed.
# libraries -------
library(ggplot2)
library(ggforce)
library(paletteer)
# plot 1 ---------------------
df |>
  ggplot(aes(Group, Shannon)) +
  # jittered points are added first so the boxplot is drawn on top of them
  geom_point(
    aes(fill = Group),
    position = position_jitternormal(sd_y = 0, sd_x = .08),
    shape = 21, size = 2, stroke = .15, color = "white"
  ) +
  scale_fill_manual(values = paletteer_d("ggsci::hallmarks_light_cosmic")) +
  # outliers are hidden because every point is already shown
  geom_boxplot(width = .15, outlier.shape = NA) +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.line = element_line(linewidth = .55),
    axis.ticks = element_line(linewidth = .55),
    panel.grid.major = element_line(linewidth = .55),
    panel.grid.minor = element_line(linewidth = .45, linetype = "dashed"),
    plot.background = element_rect(fill = "transparent", color = NA),
    plot.margin = margin(20, 20, 20, 20)
  )
The boxplot shows the Shannon index on the y-axis and the patient groups on the x-axis. Color distinguishes the two patient groups.
# libraries ------------
library(ggnewscale)
library(colorspace)
# plot 2 ------------
df |>
  ggplot(aes(Group, Shannon)) +
  # points are jittered and dodged by patient, with lightened fill colors
  geom_point(
    aes(fill = Patient),
    position = position_jitterdodge(jitter.width = .15, dodge.width = .5),
    shape = 21, size = 2, stroke = .15, color = "white"
  ) +
  scale_fill_manual(values = paletteer_d("ggsci::hallmarks_light_cosmic") |> lighten(.25)) +
  # boxplots use the same dodge width, with darkened outline colors
  geom_boxplot(
    aes(color = Patient),
    position = position_dodge(width = .5),
    width = .2, outlier.shape = NA
  ) +
  scale_color_manual(values = paletteer_d("ggsci::hallmarks_light_cosmic") |> darken(.25)) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    legend.justification = "left",
    legend.title = element_blank(),
    axis.line = element_line(linewidth = .55),
    axis.ticks = element_line(linewidth = .55),
    panel.grid.major = element_line(linewidth = .55),
    panel.grid.minor = element_line(linewidth = .45, linetype = "dashed"),
    plot.background = element_rect(fill = "transparent", color = NA),
    plot.margin = margin(20, 20, 20, 20)
  )