    Arrange

    The arrange verb

    In the last video you learned the filter verb, for extracting a subset of your observations based on a condition.

    Now you'll learn the arrange verb.

    Arrange sorts the observations in a dataset, in ascending or descending order based on one of its variables.

    This is useful, for example, when you want to know the most extreme values in a dataset.

    Sorting with arrange

    (music_top200
      >> arrange(_.duration)
    )

           country    position  track_name                        artist             streams  duration  continent
    10868  Slovakia   69        Klop Klop                         Karlo              17222    65.631    Europe
    4586   Greece     187       FENDI                             iLLEOo             16786    76.099    Europe
    9937   Poland     138       Mistrz ping-ponga                 PRO8L3M            145143   83.360    Europe
    ...    ...        ...       ...                               ...                ...      ...       ...
    535    Australia  136       Innerbloom                        RÜFÜS DU SOL       260092   578.041   Oceania
    1302   Brazil     103       Poesia Acústica #8: Amor e Samba  Pineapple StormTv  839192   614.615   Americas
    11557  Turkey     158       Susamam                           Şanışer            194804   851.871   Asia

    12400 rows × 7 columns

    Just like filter, you use the arrange verb after the pipe operator.

    You type music_top200, then the pipe operator (two greater-than symbols, >>), and then arrange. Inside arrange's parentheses, you specify the column you want to sort by, here _.duration.

    The observations are now sorted in ascending order, with the lowest duration songs appearing first.

    Look at the duration column, second from the right: it starts with 65.631, the smallest duration in the dataset, and increases from there. The track_name column shows that this track, Klop Klop, is the shortest in the data.

    Just like with filter, the music_top200 object itself is unchanged: arrange just gives you a new, sorted dataset.
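If it helps to see that last point outside the pipe syntax, here is a minimal sketch in plain pandas. It assumes music_top200 is a pandas DataFrame (the printed output suggests so) and uses a tiny hypothetical stand-in, music_small, built from a few durations shown above. sort_values serves here as a rough pandas counterpart of arrange, not as a statement of how arrange works internally.

```python
import pandas as pd

# Hypothetical stand-in for music_top200, using four rows from the output above.
music_small = pd.DataFrame({
    "track_name": ["Innerbloom", "Klop Klop", "Susamam", "FENDI"],
    "duration": [578.041, 65.631, 851.871, 76.099],
})

# Sorting returns a new DataFrame, ordered from shortest to longest duration.
by_duration = music_small.sort_values("duration")

# The sorted copy and the original have different row orders.
print(by_duration["track_name"].tolist())
print(music_small["track_name"].tolist())
```

The second print shows the original order, which is what "the object itself is unchanged" means in practice: to keep the sorted result, you would assign it to a name.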

    arrange descending

    (music_top200
      >> arrange(-_.duration)
    )

           country    position  track_name                        artist             streams  duration  continent
    11557  Turkey     158       Susamam                           Şanışer            194804   851.871   Asia
    1302   Brazil     103       Poesia Acústica #8: Amor e Samba  Pineapple StormTv  839192   614.615   Americas
    535    Australia  136       Innerbloom                        RÜFÜS DU SOL       260092   578.041   Oceania
    ...    ...        ...       ...                               ...                ...      ...       ...
    9937   Poland     138       Mistrz ping-ponga                 PRO8L3M            145143   83.360    Europe
    4586   Greece     187       FENDI                             iLLEOo             16786    76.099    Europe
    10868  Slovakia   69        Klop Klop                         Karlo              17222    65.631    Europe

    12400 rows × 7 columns

    arrange also lets you sort in descending order. To do that, put a minus sign in front of the variable you're sorting by.

    This shows that the track with the longest duration is Susamam, which charted in Turkey at least. At 851.871 seconds, it's over 14 minutes long!
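To see why a minus sign flips the order, here is a small sketch in plain pandas. The music_small frame is a hypothetical stand-in built from four durations in the output above, and the pandas calls are only an analogy for what the minus sign expresses, not a description of arrange's internals.

```python
import pandas as pd

# Hypothetical stand-in for music_top200, using four rows from the output above.
music_small = pd.DataFrame({
    "track_name": ["Innerbloom", "Klop Klop", "Susamam", "FENDI"],
    "duration": [578.041, 65.631, 851.871, 76.099],
})

# Option 1: ask pandas for a descending sort directly.
longest_first = music_small.sort_values("duration", ascending=False)

# Option 2: sort the negated durations in ascending order. The most negative
# value belongs to the longest track, so it comes first.
negated_order = (-music_small["duration"]).sort_values().index
same_result = music_small.loc[negated_order]

print(longest_first["track_name"].tolist())
print(same_result["track_name"].tolist())
```

Both approaches put Susamam first and Klop Klop last, matching the descending output above.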

    However, we might be interested in looking at duration within a specific country.

    arrange and filter

    (music_top200
      >> filter(_.country == "United States")
      >> arrange(-_.duration)
    )

          country        position  track_name                                         artist        streams  duration  continent
    7841  United States  42        After Hours                                        The Weeknd    3672033  361.027   Americas
    7915  United States  116       Life Is Good (feat. Drake, DaBaby & Lil Baby) ...  Future        2181930  315.346   Americas
    7923  United States  124       SICKO MODE                                         Travis Scott  2085268  312.820   Americas
    ...   ...            ...       ...                                                ...           ...      ...       ...
    7832  United States  33        Strawberry Peels (feat. Young Thug & Gunna)        Lil Uzi Vert  4007781  115.350   Americas
    7853  United States  54        CITY OF ANGELS                                     24kGoldn      3443366  112.493   Americas
    7971  United States  172       Skechers                                           DripReport    1731265  106.000   Americas

    200 rows × 7 columns

    Suppose you wanted to find the longest duration song in the United States.

    To do that, you can combine the two verbs you've already learned: filter, and arrange.

    arrange and filter

    (music_top200
      >> filter(_.country == "United States")
    )

          country        position  track_name              artist        streams   duration  continent
    7800  United States  1         The Box                 Roddy Ricch   12987027  196.653   Americas
    7801  United States  2         Myron                   Lil Uzi Vert  9163134   224.955   Americas
    7802  United States  3         Blueberry Faygo         Lil Mosey     8043475   162.547   Americas
    ...   ...            ...       ...                     ...           ...       ...       ...
    7997  United States  198       Lights Up               Harry Styles  1606234   172.227   Americas
    7998  United States  199       Without Me              Halsey        1606153   201.661   Americas
    7999  United States  200       Enemies (feat. DaBaby)  Post Malone   1597824   196.760   Americas

    200 rows × 7 columns

    Longest duration song in the United States.

    You start with the music_top200 dataset, then use a pipe to pass it to filter. Inside filter, you specify the condition: country == "United States" (note the double equals sign).

    Then you use another pipe step.

    arrange and filter

    (music_top200
      >> filter(_.country == "United States")
      >> arrange(-_.duration)
    )

          country        position  track_name                                         artist        streams  duration  continent
    7841  United States  42        After Hours                                        The Weeknd    3672033  361.027   Americas
    7915  United States  116       Life Is Good (feat. Drake, DaBaby & Lil Baby) ...  Future        2181930  315.346   Americas
    7923  United States  124       SICKO MODE                                         Travis Scott  2085268  312.820   Americas
    ...   ...            ...       ...                                                ...           ...      ...       ...
    7832  United States  33        Strawberry Peels (feat. Young Thug & Gunna)        Lil Uzi Vert  4007781  115.350   Americas
    7853  United States  54        CITY OF ANGELS                                     24kGoldn      3443366  112.493   Americas
    7971  United States  172       Skechers                                           DripReport    1731265  106.000   Americas

    200 rows × 7 columns

    Longest duration song in the United States.

    The added pipe step takes the result of filter and passes it to arrange. You specify that you want to sort in descending order of duration.

    The result shows that the longest duration track in the United States is After Hours by The Weeknd.
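For comparison, the same two-step idea (keep some rows, then sort them) can be sketched in plain pandas. The music_small frame below is a hypothetical stand-in with four rows taken from the outputs above, and the boolean indexing plus sort_values chain is only an approximate counterpart of the filter and arrange steps.

```python
import pandas as pd

# Hypothetical stand-in for music_top200, using four rows from the outputs above.
music_small = pd.DataFrame({
    "country": ["United States", "Turkey", "United States", "United States"],
    "track_name": ["The Box", "Susamam", "After Hours", "Skechers"],
    "duration": [196.653, 851.871, 361.027, 106.000],
})

us_longest_first = (
    music_small[music_small["country"] == "United States"]  # step 1: keep US rows
    .sort_values("duration", ascending=False)                # step 2: longest first
)

# The first row of the result is the longest US track in this small sample.
top = us_longest_first.iloc[0]
print(top["track_name"], top["duration"])
```

Susamam is longer than any US track here, but it never reaches the sort because the first step drops it. Filtering first and sorting second should give the same rows as the reverse order, with less data to sort.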

    We can explore many such questions by combining these verbs in different ways.

    Over the course of these lessons, you'll learn to pipe together multiple simple operations to create a rich and informative data analysis.

    Let's practice!

    Exercise 1:

    Modify the code below to arrange by artist name in descending order.

    hint

    You can sort in descending order using the - operator.

    another hint

    Start by using the pipe operator with arrange(). You will need to specify what to arrange by.

           country       position  track_name       artist          streams  duration  continent
    0      Argentina     1         Tusa             KAROL G         1858666  200.960   Americas
    1      Argentina     2         Tattoo           Rauw Alejandro  1344382  202.887   Americas
    2      Argentina     3         Hola - Remix     Dalex           1330011  249.520   Americas
    ...    ...           ...       ...              ...             ...      ...       ...
    12397  South Africa  198       Black And White  Niall Horan     11771    193.090   Africa
    12398  South Africa  199       When I See U     Fantasia        11752    217.347   Africa
    12399  South Africa  200       Psycho!          MASN            11743    197.217   Africa

    12400 rows × 7 columns

    Test yourself

    What artist is the last observation (row) in the result?

    Exercise 2:

    What is the first track, if you filter to keep only observations from the country Mexico, and then sort in ascending order by track name?

    ⚠️: Don't forget to replace all the blanks!

    Exercise 3:

    Below is code with the arrange verb removed. Modify it to arrange in ascending order:

    • first by position
    • second by streams

    ⚠️: Don't forget to replace all the blanks!

    Test yourself

    Which country has the position 1 track (The Box) with the fewest streams?

    Exercise 4:

    What's the shortest song in the top position in the music_top200 data?
