    Aesthetics

    Using plotnine Aesthetics

    In this lesson, you'll use plotnine aesthetics to control not only the position of points on a plot but also other features, such as size and color.

    Scatterplots

    (billie
      >> ggplot(aes("energy", "valence"))
       + geom_point()
    )

    [Figure: scatterplot of energy (x-axis) against valence (y-axis)]

    You've learned how to create a scatterplot to compare two variables in your data using two visual aesthetics: energy on the x-axis and valence on the y-axis.

    Additional variables

    billie

           artist         album           track_name      energy  valence  danceability  speechiness  acousticness  popularity  duration
    1273   Billie Eilish  dont smile ...  my boy          0.3940  0.3240   0.692         0.2070       0.472         44          170.852
    2899   Billie Eilish  WHEN WE ALL...  listen befo...  0.0561  0.0820   0.319         0.0450       0.935         79          242.652
    2950   Billie Eilish  lovely (wit...  lovely (wit...  0.2960  0.1200   0.351         0.0333       0.934         89          200.186
    ...    ...            ...             ...             ...     ...      ...           ...          ...           ...         ...
    24857  Billie Eilish  WHEN WE ALL...  ilomilo         0.4230  0.5720   0.855         0.0585       0.724         79          156.371
    24997  Billie Eilish  WHEN I WAS ...  WHEN I WAS ...  0.3320  0.0628   0.696         0.0425       0.853         71          270.520
    25147  Billie Eilish  come out an...  come out an...  0.3210  0.1770   0.640         0.0931       0.693         74          210.376

    27 rows × 10 columns

    But these aren't the only variables in the track_features dataset: for example, you also have acousticness and popularity. You may want to examine relationships among all these variables in the same plot.

    You already used the x-axis to represent energy and the y-axis to represent valence. Now you'll learn to add two more aesthetics, color and size, to communicate even more information in your scatterplot.

    The color aesthetic

    (billie
      >> ggplot(aes("energy", "valence", color = "acousticness"))
       + geom_point()
    )

    [Figure: scatterplot of energy against valence, with points colored by acousticness]

    The scatterplot shows that songs with higher energy tend to have higher valence. Another variable that might be related to energy is acousticness.

    You can explore this relationship by setting the color of your points, like you see here. To use this aesthetic, you add color = "acousticness" inside aes, next to the x mapping ("energy") and the y mapping ("valence").

    Notice that plotnine automatically adds a legend to the plot, indicating which color represents which acousticness value.

    This communicates that lower-energy tracks (toward the left of the plot) tend to be more acoustic. Note that brighter colors indicate more acoustic tracks.

    The size aesthetic

    (billie
      >> ggplot(aes("energy", "valence", color = "acousticness", size = "popularity"))
       + geom_point()
    )

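    [Figure: scatterplot of energy against valence, with points colored by acousticness and sized by popularity]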

    Another variable you may want to include in the graph is popularity, represented by the popularity variable in the dataset.

    One way to represent it is with the size of the points in the scatterplot, with more popular songs getting larger points.

    Just like x, y, and color, you add size = "popularity" within the aes parentheses.

    Note that you can put the size aesthetic on a second line to keep each line of code at a reasonable length. This doesn't make any difference to the plot, and you don't have to do it in the exercises.

    You've now learned to use four aesthetics in a plot (x, y, color, and size) to communicate information about four variables in your dataset.

    Aesthetics with multiple geoms

    (billie
      >> ggplot(aes("energy", "valence", 
                    color = "acousticness", size = "popularity",
                    label = "track_name"))
       + geom_point()
       + geom_text(nudge_y = .1)
    )

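    [Figure: scatterplot of energy against valence, with points and track-name labels colored by acousticness and sized by popularity]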

    Notice that in this plot, the aesthetics set in ggplot() affect both the points and the text.

    Let's practice!

    In the exercises, you'll learn to mix and match aesthetics and variables to further explore the track features.

    The plot below shows all top 200 hits for Eric Chou across countries. Use the code cell below to recreate it.

    (Note: running the code won't delete the plot!).

    [Figure: top 200 hits for Eric Chou across countries]

    Exercise 2:

    Use plots of the data for the artists Snelle, Bazzi, and Dayvi to answer the questions below.

    You may need to write and run code multiple times, and produce multiple plots.

    Test yourself

    Which of these artists has hit tracks on the most continents?

    Hint: try using the color aesthetic.

    Answer: Bazzi has hits on every continent.

    Test yourself

    How many *countries* does Dayvi have hit tracks in?
