TDM 10100: Project 8 — 2022
Motivation: Functions are an important part of writing efficient code.
Functions allow us to repeat and reuse code. If you find yourself using the same set of coding steps over and over, a function may be a good way to reduce your lines of code!
Context: We’ve been learning about and using functions these last few weeks.
To learn how to write your own functions, we need to learn some of the terminology and components.
Scope: R, functions
Dataset(s)
We will use the same dataset(s) as last week:
-
/anvil/projects/tdm/data/movies_and_tv/titles.csv
-
/anvil/projects/tdm/data/movies_and_tv/episodes.csv
-
/anvil/projects/tdm/data/movies_and_tv/people.csv
-
/anvil/projects/tdm/data/movies_and_tv/ratings.csv
Please select 6000 memory when launching Jupyter for this project.
Helpful Hints
fread is a fast and efficient way to read in data.
library(data.table)
titles <- data.frame(fread("/anvil/projects/tdm/data/movies_and_tv/titles.csv"))
episodes <- data.frame(fread("/anvil/projects/tdm/data/movies_and_tv/episodes.csv"))
people <- data.frame(fread("/anvil/projects/tdm/data/movies_and_tv/people.csv"))
ratings <- data.frame(fread("/anvil/projects/tdm/data/movies_and_tv/ratings.csv"))
Questions
Writing our own function makes a repetitive operation easier by turning it into a single command.
Take care to name the function something concise but meaningful, so that other users can understand what the function does.
Function parameters can also be called formal arguments.
Insider Knowledge
A function is an object that contains multiple interrelated statements, which are run together in a predefined order when the function is called.
Functions can be built-in or created by the user (user-defined). Examples of built-in functions:
-
min(), max(), mean(), median()
-
print()
-
head()
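As a quick illustration of calling the built-in functions listed above, here is a minimal sketch. The vector `example_ratings` is made up for this example and is not taken from the project datasets.

```r
# Hypothetical ratings vector, invented for illustration only
example_ratings <- c(7.1, 8.4, 5.9, 6.3, 9.0)

min(example_ratings)      # smallest value
max(example_ratings)      # largest value
mean(example_ratings)     # arithmetic average
median(example_ratings)   # middle value once sorted
head(example_ratings, 3)  # first three elements
print(example_ratings)    # display the whole vector
```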
Helpful Hints
Syntax of a function
what_you_name_the_function <- function(parameters) {
  # statement(s) that are executed when the function runs
  # the last line of the function is the returned value
}
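To make the syntax concrete, here is a small sketch of a user-defined function. The name `rating_to_percent` and its parameters are hypothetical, chosen only for this example.

```r
# Hypothetical example: convert a rating on a 10-point scale to a percentage
rating_to_percent <- function(rating, max_rating = 10) {
  # scale the rating relative to the maximum possible value
  percent <- rating / max_rating * 100
  # the last evaluated expression is the value the function returns
  percent
}

# Call the function with a single value, then with a vector
rating_to_percent(7.6)
rating_to_percent(c(5, 8.2))
```

Here `rating` and `max_rating` are the parameters (formal arguments), and `max_rating = 10` gives the second one a default, so it can be left out when calling the function.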
ONE
To gain a better insight into our data, let’s make two simple plots:
-
A grouped bar chart (see an example here)
-
A line plot (see an example here)
-
What information are you gaining from either of these graphs?
-
Code used to solve this problem.
-
Output from running the code.
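If you have not made these plot types before, the following sketch shows one possible way to draw each with base R. All numbers and labels are toy values invented for illustration. Which columns of the real data to plot is left to you.

```r
# Made-up counts of titles by type and decade (not real data)
counts <- matrix(
  c(120, 150, 180,
     40,  90, 160),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("movie", "tvSeries"), c("1990s", "2000s", "2010s"))
)

# Grouped bar chart: beside = TRUE places the bars for each type side by side
barplot(counts, beside = TRUE, legend.text = rownames(counts),
        xlab = "Decade", ylab = "Number of titles",
        main = "Titles by type and decade (toy data)")

# Line plot: one value per year, joined by a line
years <- 2015:2020
avg_rating <- c(6.8, 6.9, 7.0, 6.7, 7.1, 7.2)
plot(years, avg_rating, type = "l",
     xlab = "Year", ylab = "Average rating",
     main = "Average rating per year (toy data)")
```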
TWO
Now that you have a basic understanding of how to make a function, let's practice by applying that knowledge to our dataset.
Here are the pieces of a function we will use on this dataset. Put them in the correct order:
-
results <- merge(ratings_df, titles_df, by.x = "title_id", by.y = "title_id")
-
}
-
function(titles_df, ratings_df, ratings_of_at_least)
-
return(popular_movie_results)
-
{
-
popular_movie_results <- results[results$type == "movie" & results$rating >= ratings_of_at_least, ]
-
find_movie_with_at_least_rating <-
-
Code used to solve this problem.
-
Output from running the code.
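To see what the two main building blocks do on their own, here is a minimal sketch of `merge()` followed by a logical row filter. The data frames `toy_titles` and `toy_ratings` are invented for this example. Only the column names `title_id`, `type`, and `rating` are taken from the pieces above.

```r
# Toy data frames, made up for illustration
toy_titles <- data.frame(
  title_id = c("t1", "t2", "t3"),
  type = c("movie", "tvSeries", "movie"),
  stringsAsFactors = FALSE
)
toy_ratings <- data.frame(
  title_id = c("t1", "t2", "t4"),
  rating = c(8.1, 6.5, 7.0),
  stringsAsFactors = FALSE
)

# merge() keeps only the rows whose title_id appears in both data frames
merged <- merge(toy_ratings, toy_titles, by.x = "title_id", by.y = "title_id")
merged

# Keep the rows where both conditions are TRUE
merged[merged$type == "movie" & merged$rating >= 8, ]
```

Titles `t3` and `t4` drop out of `merged` because each appears in only one of the two data frames.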
THREE
Take the above function and add comments explaining what the function does at each step.
-
Code used to solve this problem.
-
Output from running the code.
FOUR
my_selection <- find_movie_with_at_least_rating(titles, ratings, 7.6)
Using the code above, answer these questions.
-
How many movies in total have a rating at or above that limit?
-
Change the limits in the function from "at least 5.0" to "lower than 5.0".
-
How many movies have ratings lower than 5.0?
-
Code used to solve this problem.
-
Output from running the code.
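Counting the rows that satisfy a condition can be sketched as follows. The data frame `toy` is made up for illustration, and the same idea applies to whatever data frame your function returns.

```r
# Toy data frame, invented for illustration
toy <- data.frame(
  title_id = c("t1", "t2", "t3", "t4"),
  rating = c(4.2, 5.0, 7.9, 3.1),
  stringsAsFactors = FALSE
)

# Rows with a rating of at least 5.0
at_least_5 <- toy[toy$rating >= 5.0, ]
nrow(at_least_5)

# Flipping the comparison selects the rows with a rating lower than 5.0
below_5 <- toy[toy$rating < 5.0, ]
nrow(below_5)

# Summing a logical vector counts its TRUE values, giving the same count
sum(toy$rating < 5.0)
```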
FIVE
Now create a function that takes a genre as the input and finds either
-
the movie from that genre that has the largest number of votes, OR
-
the movie from that genre that has the highest rating.
(You don’t need to do both. In the video, I discuss how to find the movie from that genre that has the highest rating.)
-
Code used to solve this problem.
-
Output from running the code.
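One possible shape for such a function is sketched below on toy data. The data frame `toy_movies`, the function name `top_rated_in_genre`, and the assumption that genres are stored as comma-separated text in a column called `genres` are all hypothetical. Check the actual column names in the project data before adapting it.

```r
# Toy data, invented for illustration; the genres column layout is an assumption
toy_movies <- data.frame(
  title = c("A", "B", "C", "D"),
  genres = c("Comedy,Drama", "Drama", "Comedy", "Action"),
  rating = c(7.2, 8.8, 6.1, 9.0),
  stringsAsFactors = FALSE
)

top_rated_in_genre <- function(movies_df, genre) {
  # keep the rows whose genres text contains the requested genre
  in_genre <- movies_df[grepl(genre, movies_df$genres, fixed = TRUE), ]
  # which.max() gives the position of the largest rating (the first one if tied)
  in_genre[which.max(in_genre$rating), ]
}

top_rated_in_genre(toy_movies, "Comedy")
```

Replacing `rating` with a vote-count column inside `which.max()` would give the other variant of the question.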
Please make sure to double check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it, to make sure that what you think you submitted is what you actually submitted. In addition, please review our submission guidelines before submitting your project.