    Wrap up

    Putting it together

    This lesson shows what the beginning of an analysis might look like. Generally, data analysis is done in notebooks, like this one. In a notebook, you can alternate between blocks of code and narrative text.

    The first step of an analysis is often importing the tools you will need. For example, verbs like filter and mutate are imported from siuba.

    The imports for this analysis are shown below.

    # here we import verbs like filter, arrange, and mutate from siuba.
    # the import * means to import all of siuba's verbs.
    from siuba import *
    
    # here we import everything for plotting from plotnine (like ggplot())
    from plotnine import *
    
    # here we import the data for the course
    # note that rather than using * to get everything, you can name
    # specific things to import (like track_features)
    from music_top200 import music_top200, track_features
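
To make the difference between the two import styles concrete, here is a minimal sketch that uses Python's standard library instead of the course packages. The choice of math and statistics is only an illustration (an assumption made so the example runs without siuba or plotnine installed); the import behavior is the same for any module.

```python
# Named import: only the names you list become available.
from math import sqrt, pi

# Star import: every public name the module exports becomes available,
# which is convenient in a notebook but makes it harder to see
# where a given function came from.
from statistics import *

root = sqrt(16)               # sqrt was imported by name
area = pi * 2 ** 2            # so was pi
average = mean([1, 2, 3, 4])  # mean arrived through the star import

print(root, area, average)
```

Star imports keep notebook code short, which is likely why the course uses them for siuba and plotnine. Naming specific objects, as done for music_top200 and track_features, makes it explicit which names the rest of the analysis relies on.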
    

    Exercise 1:

    For the artist with the top track in Spain, what country has the most streams for one of their tracks?

    Note: you may need to write and run code multiple times.

    Hint:

    First, find the artist in the top position in Spain. Then, can you get only that artist's tracks? Once you do that, you should be close!

    ()
    

    Exercise 2:

    Subset the data to keep only tracks in Hong Kong, then calculate a new column called stream_seconds, equal to each track's streams multiplied by its duration.

    Once you've done that, try removing the comment markers (#) from the code below to plot the data.

    ⚠️: Don't forget to replace all the blanks!
