Q: A data professional determines the best fit line by calculating the difference between observed values and the predicted values of a regression line. What is this calculation?
- Notion
- Coefficient
- Parameter
- Residual
Q: In linear regression, what mathematical technique is used to
calculate the best fit line?
- Coefficient of determination
- Sum of squared residuals
- Hold out coefficient
- Ordinary least squares
Q: A data professional testing for linear regression assumptions plots
their dependent variable against their independent variable and notices that
the graph appears as a repeating waveform. Which model assumption does this
invalidate?
- Independent observation
- Normality
- Linearity
- Homoscedasticity
Q: Fill in the blank: A scatterplot matrix is a series of scatterplots
that show the _____ between pairs of variables.
- distances
- discrepancies
- relationships
- variability
Q: A data professional at a toy manufacturer checks model assumptions
while working on a project about potential new game concepts. They find no
clear pattern in their scatterplot and can confirm constant variance along the
values of the dependent variable. What does this scenario describe?
- Independent observation
- Normality
- Linearity
- Homoscedasticity
Homoscedasticity means constant variance: the spread of the residuals around the regression line stays the same across all values of the independent variable(s). A distinct pattern in the scatterplot, such as a funnel shape in which the variance grows as the independent variable increases, indicates heteroscedasticity instead. Heteroscedasticity violates the homoscedasticity assumption and can undermine the validity and reliability of the regression model's results.
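As a rough illustration of the constant-variance idea, the sketch below fits a line to two made-up datasets and compares the residual variance for low and high values of X. The data, the helper names (`fit_line`, `residual_spread_by_half`), and the split-in-half comparison are assumptions chosen for illustration, not a formal statistical test.

```python
from statistics import mean, pvariance

def fit_line(x, y):
    """Least-squares fit for one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residual_spread_by_half(x, y):
    """Variance of the residuals for the lower and upper half of the x values."""
    intercept, slope = fit_line(x, y)
    pairs = sorted(zip(x, y))
    residuals = [yi - (intercept + slope * xi) for xi, yi in pairs]
    mid = len(residuals) // 2
    return pvariance(residuals[:mid]), pvariance(residuals[mid:])

x = list(range(1, 21))
# Hypothetical data: the noise alternates in sign and grows with x (funnel shape).
y_funnel = [2 * xi + (0.5 * xi if xi % 2 else -0.5 * xi) for xi in x]
# Hypothetical data: the noise alternates in sign but keeps a constant size.
y_even = [2 * xi + (3 if xi % 2 else -3) for xi in x]

low, high = residual_spread_by_half(x, y_funnel)
print(f"funnel:   low-x variance={low:.2f}, high-x variance={high:.2f}")
low_c, high_c = residual_spread_by_half(x, y_even)
print(f"constant: low-x variance={low_c:.2f}, high-x variance={high_c:.2f}")
```

In the funnel-shaped data the residual variance is much larger in the upper half of X, which is the pattern associated with heteroscedasticity; in the second dataset the two halves have about the same variance.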
Q: Fill in the blank: A confidence band is the area surrounding a line
that describes the uncertainty around the predicted outcome at every value of
_____.
- intercept
- X
- slope
- Y
Q: What is another term for R squared?
- Residuals of determination
- Error of residuals
- Coefficient of determination
- Coefficient of residuals
Q: Which of the following statements accurately describe running a
randomized, controlled experiment? Select all that apply.
- It is a study design that systematically and methodically assigns participants into groups.
- The differences between the control and treatment groups must be observable and measurable.
- To be successful, data professionals must control for every factor in the experiment.
- It is typically used when arguing for causation between variables.
Q: Fill in the blank: _____ is the difference between observed values
and the predicted values of a regression line.
- Coefficient
- Residual
- Intercept
- Error
Q: A data professional minimizes the sum of squared residuals to
estimate parameters in a linear regression model. What method are they using?
- Residual coefficients
- Mean absolute error
- R squared
- Ordinary least squares
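To make the idea of minimizing the sum of squared residuals concrete, here is a minimal sketch using the closed-form estimates for a single predictor. The data values and the function names (`ols_fit`, `sum_squared_residuals`) are invented for illustration.

```python
from statistics import mean

def ols_fit(x, y):
    """Closed-form least-squares estimates (intercept, slope) for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared differences between observed and predicted values."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical observations.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(x, y)
best = sum_squared_residuals(x, y, b0, b1)
# Any other line, such as this slightly nudged one, has a larger sum.
nudged = sum_squared_residuals(x, y, b0 + 0.1, b1 - 0.05)
print(f"intercept={b0:.2f}, slope={b1:.2f}")
print(best < nudged)
```

The fitted line has a smaller sum of squared residuals than the nudged line, which is the property that defines the least-squares estimates.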
Q: A data analytics professional working for a storage facility checks
model assumptions while determining optimal storage space sizes. They notice
that the model’s residuals appear in a cone-shaped pattern when plotted against
the independent variable. Which model assumption does this invalidate?
- Normality
- Homoscedasticity
- Independent observation
- Linearity
Q: A data professional determines how much of the variation in the Y
variable is explained by the X variable. Which model evaluation
metric enables this determination?
- Mean absolute error (MAE)
- Mean squared error (MSE)
- P-value
- R squared
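The proportion of variation explained can be sketched as one minus the ratio of the residual sum of squares to the total sum of squares. The observations below and the line used for predictions (intercept 0.05, slope 1.99) are hypothetical values assumed for illustration.

```python
from statistics import mean

def r_squared(observed, predicted):
    """1 - (sum of squared residuals) / (total sum of squares around the mean)."""
    y_bar = mean(observed)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(observed, predicted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in observed)
    return 1 - ss_res / ss_tot

# Hypothetical observations and predictions from an assumed fitted line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_hat = [0.05 + 1.99 * xi for xi in x]

r2 = r_squared(y, y_hat)
print(f"R squared = {r2:.4f}")
```

A value close to 1 means the line accounts for most of the variation in Y; a value near 0 means it accounts for very little.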
Q: Fill in the blank: A scatterplot _____ is a series of scatterplots
that show the relationships between pairs of variables.
- succession
- matrix
- array
- progression
Q: Which of the following statements accurately describe a randomized,
controlled experiment? Select all that apply.
- As the study is conducted, the only expected similarity between the control and experimental groups is the outcome variable being studied.
- The differences between the control and treatment groups must be observable and measurable.
- It is a study design that randomly assigns participants into an experimental group or a control group.
- To be successful, data professionals must control for every factor in the experiment.
Q: In linear regression, what mathematical technique is used to
calculate beta zero hat and beta one hat?
- Coefficient R squared
- Mean squared error
- Ordinary least squares
- Coefficient of determination
Q: Fill in the blank: A scatterplot matrix is a series of scatterplots
that show the relationships between pairs of _____.
- models
- coordinates
- variables
- lines
Q: What is the difference between observed or actual values and the
predicted values of a regression line?
- Beta
- Slope
- Residual
- Parameter
Q: Fill in the blank: A _____ is the area surrounding a line that
describes the uncertainty around the predicted outcome at every value of X.
- confidence band
- confidence slope
- interval band
- interval slope
Q: What measures the proportion of variation in the dependent variable
Y explained by the independent variable X?
- R squared
- P-value
- Mean absolute error (MAE)
- Mean squared error (MSE)
Q: Fill in the blank: A confidence band is the area surrounding a line
that describes the _____ around the predicted outcome at every value of X.
- uncertainty
- certainty
- accuracy
- inaccuracy
Q: What term describes the difference between observed or actual values
and the predicted values of the regression line?
- Residuals
- Best fit lines
- Ordinary least squares
- Predicted values