Q: Which of the following terms are used to describe missing data? Select all that apply.
- Zero
- Blank
- NaN
- N/A
Q: Stakeholders at a film studio hire a data analytics firm to provide
insights about the best locations for film shoots. However, the film studio’s
datasets contain missing data. Which of the following strategies can help the
data analytics firm solve this problem? Select all that apply.
- Use their best judgment to add in values themselves.
- Create a NaN category.
- Add in the missing values by taking the average values from the existing data.
- Ask the film studio to fill in the missing values.
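Two of the strategies listed above, filling gaps with the average of the existing values and creating a separate NaN category, could be sketched in pandas as follows. The dataframe, its column names, and its values are invented for illustration and do not come from the film studio scenario.

```python
import numpy as np
import pandas as pd

# Hypothetical data: names and values are made up for this sketch.
shoots = pd.DataFrame({
    "location": ["Iceland", None, "Morocco", "Iceland"],
    "daily_cost": [1200.0, np.nan, 800.0, 1000.0],
})

# Numeric column: replace missing values with the mean of the existing ones.
mean_cost = shoots["daily_cost"].mean()
shoots["daily_cost_filled"] = shoots["daily_cost"].fillna(mean_cost)

# Categorical column: give missing values their own "NaN" category.
shoots["location_filled"] = shoots["location"].fillna("NaN")
```

Writing the results to new columns keeps the original values available for comparison.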
Q: A data professional writes the following code:
df.merge(df_zip, how='left',
on=['date','center_point_geom'])
Which section of the code refers to the dataframe to be merged with df?
- df_zip
- how='left'
- merge
- center_point_geom
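To make the parts of the merge call concrete, here is a minimal runnable sketch. The contents of df and df_zip are not shown in the question, so the columns and values below (including strikes and zip_code) are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical stand-ins for df and df_zip.
df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02"],
    "center_point_geom": ["POINT(1 1)", "POINT(2 2)"],
    "strikes": [5, 7],
})
df_zip = pd.DataFrame({
    "date": ["2024-01-01"],
    "center_point_geom": ["POINT(1 1)"],
    "zip_code": ["10001"],
})

# df is the left dataframe; df_zip is the dataframe merged into it.
# how="left" keeps every row of df; on=[...] names the key columns to match.
df_joined = df.merge(df_zip, how="left", on=["date", "center_point_geom"])
```

Rows of df with no match in df_zip are kept, and their zip_code is left missing.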
Q: What pandas function is used to pull all of the missing values from a
data frame?
- pd.getnull()
- pd.ofnull()
- pd.findnull()
- pd.isnull()
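As a small illustration of how pd.isnull() behaves, the sketch below applies it to a made-up dataframe. The function returns a Boolean mask rather than the missing values themselves, and the mask can then be used to count or select them.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with one missing value in each column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", None]})

# Same shape as df, True wherever a value is missing.
mask = pd.isnull(df)

# Number of missing values in each column.
missing_per_column = mask.sum()

# Rows that contain at least one missing value.
rows_with_missing = df[mask.any(axis=1)]
```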
Q: What type of outliers are values that are completely different from
the overall data group and have no association with any other outliers?
- Collective outliers
- Global outliers
- Contextual outliers
- Dissimilar outliers
Q: A data professional works for a car insurance company. To gain
insights about the popularity of electric vehicles, they study categorical data
about cars. They add a 0 to their dataset to indicate if a car is gas-powered
and a 1 if a car is electric. What does this scenario describe?
- Applying a variable character
- Changing a floating point
- Using dummy variables
- Removing a data operator
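The 0/1 encoding in this scenario could be sketched as follows. The fuel column and its values are hypothetical, and pd.get_dummies is shown as one common way to build such indicator columns, not necessarily the approach used in the scenario.

```python
import pandas as pd

# Hypothetical categorical data about cars.
cars = pd.DataFrame({"fuel": ["gas", "electric", "gas", "electric"]})

# Manual indicator: 1 if the car is electric, 0 if it is gas-powered.
cars["is_electric"] = (cars["fuel"] == "electric").astype(int)

# pd.get_dummies builds one 0/1 indicator column per category.
dummies = pd.get_dummies(cars["fuel"], dtype=int)
```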
Q: What type of data visualization uses two colors to depict the magnitude
of a set of values and show where those values are concentrated?
- Heat map
- Treemap
- Scatter plot
- Density map
Q: What does the pandas method duplicated() (called on a dataframe, as in
df.duplicated()) return to indicate that a data value does not have a
duplicate value within the same dataset?
- True
- Duplicate
- Unique
- False
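A short sketch of duplicate detection on an invented dataframe. In pandas, duplicated() is a method called on a DataFrame or Series (there is no top-level pd.duplicated), and it returns one Boolean per row.

```python
import pandas as pd

# Hypothetical data in which row 2 repeats row 1 exactly.
df = pd.DataFrame({"id": [1, 2, 2, 3], "city": ["Oslo", "Lima", "Lima", "Oslo"]})

# By default the first occurrence is not flagged; later repeats are.
flags = df.duplicated()

# subset restricts the comparison to the listed columns.
city_flags = df.duplicated(subset=["city"])
```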
Q: Fill in the blank: The pandas function _____ enables data
professionals to create a new dataframe with all duplicate rows removed.
- drop_duplicates()
- deduplicate()
- de_duplication()
- deduplication()
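Removing duplicate rows might look like the sketch below, again using made-up data. By default the call returns a new dataframe and leaves the original unchanged.

```python
import pandas as pd

# Hypothetical data in which row 2 repeats row 1 exactly.
df = pd.DataFrame({"id": [1, 2, 2, 3], "city": ["Oslo", "Lima", "Lima", "Oslo"]})

# New dataframe with repeated rows removed; the first occurrence is kept.
deduped = df.drop_duplicates()
```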
Q: Which of the following terms can be used to describe a value that is
not stored for a variable in a set of data? Select all that apply.
- Zero
- N/A
- NaN
- Blank
Q: A data professional writes the following code:
df.merge(df_zip, how='left',
on=['date','center_point_geom'])
Which of the following is a parameter for the merge?
- df_joined
- how='left'
- df.merge()
- df.head()
Q: What tasks could the pandas function pd.isnull() be used for? Select
all that apply.
- To delete all of the values from a data frame
- To change all values to nulls in a data frame
- To identify when a value is missing from a data frame
- To pull all of the missing values from a data frame
Q: Fill in the blank: Contextual outliers are normal data points under
certain conditions but become _____ under most other conditions.
- Insignificant
- Samples
- Anomalies
- Standard
Q: A data professional works for a veterinary office. To gain insights
about the most common household pets, they study categorical data about pet
adoptions over the past five years. They assign the number 1 to dogs, 2 to
cats, 3 to hamsters, and so on. What does this scenario describe?
- Data blending
- Label encoding
- Data partitioning
- Aliasing
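The numbering in this scenario could be reproduced with an explicit mapping, as in the sketch below. The species column is hypothetical, and the category-codes alternative is one pandas idiom among several.

```python
import pandas as pd

# Hypothetical adoption records.
pets = pd.DataFrame({"species": ["dog", "cat", "hamster", "dog"]})

# Explicit mapping that matches the numbering in the scenario.
label_map = {"dog": 1, "cat": 2, "hamster": 3}
pets["species_label"] = pets["species"].map(label_map)

# Alternative: let pandas assign integer codes (0-based, in sorted category order).
pets["species_code"] = pets["species"].astype("category").cat.codes
```

The explicit mapping gives control over which number each category receives, while the category codes follow alphabetical order here.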
Q: Fill in the blank: A _____ is a data visualization that displays the
magnitude of a set of values using two colors to show the concentration of the
values.
- heat map
- bubble chart
- bar graph
- line chart
Q: Fill in the blank: A data professional should _____ a duplicate when
its value is clearly a mistake or will misrepresent the remaining unique values
within the dataset.
- eliminate
- keep
- filter
- replicate
Q: Fill in the blank: N/A and NaN are terms used to describe _____
data.
- missing
- nominal
- qualitative
- string
Q: What does the pandas method duplicated() (called on a dataframe, as in
df.duplicated()) return to indicate that a data value is a duplicate of
another value within the same dataset?
- Duplicate
- Unique
- False
- True
Q: A data professional at a garden center researches data related to
ideal growing climates. As they familiarize themselves with the datasets, they
discover some data is missing. Which of the following strategies can help them
solve this problem? Select all that apply.
- Change the missing values to Boolean data that is either true or false.
- Create a NaN category.
- Derive new representative values based on available data.
- Add in the missing values by taking the average values from the existing data.
Q: What pandas function enables a data professional to determine if duplicate values are present in a dataset?
- df.deduplication()
- df.duplicated()
- df.dupe()
- df.deduplicates()
Q: A data team for an investment banker works on a project related to interest rates. As they familiarize themselves with the datasets, they discover some data is missing. Which of the following strategies can help them solve this problem? Select all that apply.
- Change the missing values to zeros.
- Ask the owner of the data to fill in the missing values.
- Derive new representative values based on available data.
- Add in the missing values by taking the average values from the existing data.