Cumulative MoMA Painting Acquisitions by Artist Gender
From Lab 2
Published
April 14, 2026
How many paintings did the Museum of Modern Art (MoMA) in New York City cumulatively acquire between 1930 and 2017? Were there differences in overall acquisitions between male and female artists? Were there any gender-related differences in acquisition rates?
Code
# Load the plotting package
library(ggplot2)

# Make our new plot
artworks_long |>
  ggplot(aes(
    x = year_acquired,
    y = total_by_gender,
    fill = artist_gender
  )) +
  # Use `geom_area()` and adjust position so the groups don't
  # "stack" on top of each other
  geom_area(
    color = "black",
    position = "identity"
  ) +
  # Adjust the theme and legend properties
  theme_bw() +
  theme(
    legend.position = "inside",
    legend.position.inside = c(0.2, 0.75),
    legend.background = element_rect(color = "black"),
    axis.text = element_text(size = 10),
    axis.title = element_text(size = 12)
  ) +
  # Fix our labels
  labs(
    title = "Cumulative Number of Paintings Acquired Over Time by Artist Gender",
    subtitle = "(MoMA, 1930-2017)",
    x = "Year",
    y = "Number of Acquired Works",
    fill = "Artist Gender"
  )
Data
These data come from a record set of inventoried MoMA artworks hosted on GitHub. The data have been cleaned for use by our BMI 525 cohort and filtered to include only paintings. Rows in the data set represent individual artworks, with associated variables for information like title, artist, year_acquired, height_cm, width_cm, etc.
Audience
This particular plot is intended for a broad audience of scientists and non-scientists alike (e.g., the general readership of a magazine or blog about museums).
Graph Type
This plot is an example of an area plot, which is a type of line plot where the area under a given line is filled in with a particular color. Of note, area plots with multiple categories (e.g., two genders) can show either absolute or “stacked” values. So-called “stacked” area plots get their name because they “stack” the areas of multiple groups on top of one another, such that the upper-most line shows the total of all categories together and the color blocks show each category's share of that total. The plot above, however, shows only absolute values for each group, so the two color blocks can be read independently of one another.
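To make the difference between absolute and stacked values concrete, here is a minimal sketch in base R. The data frame `toy` and its numbers are invented for illustration and are not MoMA data; the sketch only computes where the top edge of each area would sit under the two positioning options of `geom_area()`.

```r
# Hypothetical cumulative totals (illustrative values, not MoMA data)
toy <- data.frame(
  year   = c(2000, 2001, 2002),
  male   = c(10, 14, 20),
  female = c(2, 3, 5)
)

# position = "identity": each area's top edge is that group's own total
identity_top_male   <- toy$male
identity_top_female <- toy$female

# position = "stack" (the `geom_area()` default): one group is drawn on top
# of the other, so the upper-most edge is the sum across groups
stack_top <- toy$male + toy$female

print(data.frame(
  year = toy$year,
  identity_top_male,
  identity_top_female,
  stack_top
))
```

With stacking, the height of the upper-most edge no longer equals either group's own total, which is why the plot above uses the identity position instead.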
Representation Description
The plot shows the cumulative number of paintings (y-axis) acquired by MoMA from male and female artists (indicated by fill color) from 1930 to 2017 (left to right on the x-axis). The shaded regions for male and female artists are overlaid and indicate independent group totals. The total number of works by both male and female artists has increased over time, though the curve appears steeper for male artists than for female artists, even up to 2017, suggesting that the museum has historically acquired paintings by male artists at a faster rate than paintings by female artists.
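One rough way to put a number on the visual impression of "steepness" is to compute the average number of works added per year for each group. The sketch below assumes a long-format data frame with the same column names as `artworks_long` (`year_acquired`, `artist_gender`, `total_by_gender`); the object `toy_long` and its values are made up for illustration and say nothing about the real MoMA figures.

```r
library(dplyr)

# Hypothetical cumulative totals (illustrative values, not MoMA data)
toy_long <- data.frame(
  year_acquired   = rep(c(1930, 1960, 1990, 2017), times = 2),
  artist_gender   = rep(c("male", "female"), each = 4),
  total_by_gender = c(10, 400, 900, 1500, 0, 20, 60, 150)
)

# Average slope per group: change in cumulative total divided by years elapsed
avg_rates <- toy_long |>
  group_by(artist_gender) |>
  summarize(
    works_per_year = (max(total_by_gender) - min(total_by_gender)) /
      (max(year_acquired) - min(year_acquired))
  )

print(avg_rates)
```

This is only an overall average; it would not capture the year-to-year fluctuations in slope that the plot itself can show.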
Tips for Interpretation
Start by looking at the x- and y-axes to note the time-span and the range of acquired paintings, respectively. Look at the color blocks one at a time: how does the overall area of the female artist block compare to the area of the male artist block? Now, compare the slopes of the two shapes (i.e., the steepness of the top edge as we move left to right along the x-axis). Does one color block have a steeper slope than the other? Which one? Are there any fluctuations in these slopes over the years?
Presentation Considerations
The default color set has been used to fill in the areas for the male and female gender groups, since the basic two-category scenario in the present case is easily handled by the default palette. The areas were not “stacked,” since we want to know about total acquisitions within each gender group, not the proportion of the two groups, per se.
Method
Starting from the cleaned and filtered (paintings only) data, we first need to tabulate the cumulative sum of paintings by artist gender after arranging the works by year of acquisition. Afterwards, the data frame needs to be pivoted longer.
Code
# Load the wrangling packages
library(dplyr)
library(tidyr)

# Create new variables `total_(fe)male_artists` and pivot longer
artworks_long <- artworks_clnd2 |>
  # Filter out `NA` values
  filter(
    !is.na(artist_gender),
    !is.na(year_acquired)
  ) |>
  # Sort by year of acquisition
  arrange(year_acquired) |>
  # Calculate cumulative sum of works by gender
  mutate(
    total_female_artists = cumsum(n_female_artists),
    total_male_artists = cumsum(n_male_artists)
  ) |>
  # Reduce to relevant variables
  select(
    year_acquired,
    total_female_artists,
    total_male_artists
  ) |>
  # Pivot longer and rename columns
  pivot_longer(
    cols = starts_with("total_"),
    names_to = "artist_gender",
    names_pattern = "total_(.*)_artists",
    values_to = "total_by_gender"
  ) |>
  # Adjust the order of the factors to help with plotting in the next step
  mutate(
    artist_gender = factor(
      artist_gender,
      levels = c("male", "female")
    )
  )
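To see what the cumulative-sum and pivot steps do to the shape of the data, the same pipeline can be traced on a tiny made-up input. The data frame `toy_artworks` is hypothetical; it only borrows the column names `year_acquired`, `n_female_artists`, and `n_male_artists` from the code above, and its three rows are invented.

```r
library(dplyr)
library(tidyr)

# Three hypothetical artworks, deliberately out of chronological order
toy_artworks <- data.frame(
  year_acquired    = c(1931, 1930, 1932),
  n_female_artists = c(1, 0, 0),
  n_male_artists   = c(0, 1, 1)
)

toy_long <- toy_artworks |>
  # Sort first so the running totals accumulate in time order
  arrange(year_acquired) |>
  mutate(
    total_female_artists = cumsum(n_female_artists),
    total_male_artists   = cumsum(n_male_artists)
  ) |>
  select(year_acquired, total_female_artists, total_male_artists) |>
  # One row per year and gender; the regex keeps only "female" / "male"
  pivot_longer(
    cols = starts_with("total_"),
    names_to = "artist_gender",
    names_pattern = "total_(.*)_artists",
    values_to = "total_by_gender"
  )

print(toy_long)
```

Each input row becomes two output rows (one per gender), so the three toy artworks yield six rows with the columns `year_acquired`, `artist_gender`, and `total_by_gender`, which is the layout that `aes(fill = artist_gender)` expects.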
After wrangling the data, the figure can be constructed with geom_area(), setting position = "identity" to prevent stacking.
Code
# Initiate the plot
artworks_long |>
  ggplot(aes(
    x = year_acquired,
    y = total_by_gender,
    fill = artist_gender
  )) +
  # Use `geom_area()` and adjust position so the groups don't
  # "stack" on top of each other
  geom_area(
    position = "identity"
  )