Q: Which of the following statements correctly describe supervised and
unsupervised machine learning? Select all that apply.
- Unsupervised machine learning uses labeled datasets to train algorithms
to classify or predict outcomes.
- Supervised machine learning uses labeled datasets to train algorithms to
classify or predict outcomes.
- In unsupervised machine learning, data professionals ask the model to
give them information without telling the model what the answer should
be.
- Unsupervised machine learning involves data professionals asking a model
to give them information without specifying a desired outcome.
Explanation: In supervised learning, the model learns from labeled data, using the known output as a guide to learn the relationship between the input features and the target variable. In unsupervised learning, the data is unlabeled, and the model is tasked with discovering patterns, structures, or relationships in it without being told what the output should look like.
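As a rough illustration of this contrast, the sketch below uses a toy dataset of hours studied. The data and the function names (predict_nearest, two_means) are made up for this example. The supervised part copies the label of the closest labeled example, and the unsupervised part groups unlabeled values around two centers. It assumes the values contain at least two distinct numbers.

```python
# Supervised: labeled examples (hours studied -> pass/fail) guide the prediction.
labeled = [(1.0, "fail"), (2.0, "fail"), (8.0, "pass"), (9.0, "pass")]

def predict_nearest(x, examples):
    """Return the label of the labeled example closest to x (1-nearest neighbor)."""
    _, nearest_label = min(examples, key=lambda pair: abs(pair[0] - x))
    return nearest_label

# Unsupervised: no labels; group the values around two centers (tiny 1-D k-means).
def two_means(values, iterations=10):
    """Split values into two clusters by repeatedly re-averaging two centers."""
    low, high = min(values), max(values)  # assumes at least two distinct values
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return sorted(near_low), sorted(near_high)

print(predict_nearest(7.5, labeled))
print(two_means([1.0, 2.0, 8.0, 9.0]))
```

The first function cannot work without the labels, while the second never sees any.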
Q: Fill in the blank: The terms machine learning and _____ both refer to
training a computer to detect patterns in data without being explicitly
programmed to do so.
- coding
- artificial intelligence
- reinforcement learning
- quality assurance
Explanation: The terms machine learning and artificial intelligence are often used interchangeably to describe training a computer to detect patterns in data without being explicitly programmed to do so.
Q: An analytics team at a college works on a task involving categorical
variables. Which of the following variables might be part of the project
dataset? Select all that apply.
- Number of books in a classroom
- Languages spoken at the college
- Student nationalities
- Teacher subject area expertise
Explanation: Categorical variables sort observations into distinct groups or categories rather than measuring a quantity. Languages spoken at the college, student nationalities, and teacher subject area expertise all take their values from a set of named categories. "Number of books in a classroom" is a numeric count, so it is not a categorical variable.
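One practical difference is how each kind of variable is summarized. The sketch below uses made-up survey rows (the values and variable names are assumptions for illustration): the categorical column is summarized by how often each category occurs, while the numeric column supports arithmetic such as an average.

```python
from collections import Counter

# Hypothetical survey rows: (language spoken, books in classroom)
rows = [("Spanish", 12), ("English", 30), ("Spanish", 18), ("Mandarin", 24)]

languages = [language for language, _ in rows]
book_counts = [books for _, books in rows]

# Categorical values are summarized by how often each category occurs.
language_frequencies = Counter(languages)

# Numeric values support arithmetic such as an average.
average_books = sum(book_counts) / len(book_counts)

print(language_frequencies.most_common(1))
print(average_books)
```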
Q: Which of the following statements accurately describes content-based
filtering? Select all that apply.
- Content-based filtering effectively makes recommendations across content
types.
- Content-based filtering does not require information from other users to
work properly.
- Content-based filtering properties often have to be selected and mapped
manually.
- Content-based filtering recommends more of what a user likes.
Explanation: Content-based filtering bases its recommendations on the user's own previous interactions and on the attributes of the items themselves, so it does not need information from other users. By focusing on the individual user's tastes, it recommends items similar to those the user has liked or engaged with in the past. The item attributes it relies on often have to be selected and mapped manually.
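A minimal sketch of the idea, assuming a hypothetical catalog whose attributes were chosen by hand. The titles, attributes, and the choice of Jaccard similarity are all assumptions for illustration; real systems may score similarity differently. Note that no other user's data appears anywhere.

```python
# Hypothetical catalog: each item is described by manually chosen attributes.
catalog = {
    "Space Saga": {"sci-fi", "adventure"},
    "Robot Dawn": {"sci-fi", "thriller"},
    "Garden Tales": {"documentary", "nature"},
    "Star Quest": {"sci-fi", "adventure", "comedy"},
}

def jaccard(a, b):
    """Share of attributes two items have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b)

def recommend(liked_title, items):
    """Rank the other items by attribute similarity to one liked item."""
    liked = items[liked_title]
    others = [title for title in items if title != liked_title]
    return sorted(others, key=lambda title: jaccard(liked, items[title]), reverse=True)

print(recommend("Space Saga", catalog))
```

The ranking depends entirely on the attribute sets, which is why selecting and mapping them well matters.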
Q: Fill in the blank: A key benefit of collaborative filtering is that
it finds hidden _____ in the data.
- duplicates
- correlations
- contradictions
- errors
Explanation: Collaborative filtering is a recommendation technique that finds patterns in the relationships between users and items (products, services, and so on) based on users' preferences or behavior. It uses these patterns to recommend items to a user based on what other users with similar tastes have liked. In doing so, it uncovers hidden correlations in the data that other methods would not readily reveal.
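The following is a small sketch of user-based collaborative filtering under simplifying assumptions. The ratings, user names, and the similarity measure (negative mean absolute rating gap on co-rated items) are invented for illustration and are not a standard library API. It finds the most similar other user and suggests items that user rated highly.

```python
# Hypothetical ratings (1-5); a missing key means the user has not rated the item.
ratings = {
    "ana":  {"novel": 5, "podcast": 4, "film": 1},
    "ben":  {"novel": 5, "podcast": 5, "film": 1, "album": 5},
    "cara": {"novel": 1, "podcast": 2, "film": 5, "game": 5},
}

def similarity(a, b):
    """Negative mean absolute rating gap on co-rated items (higher is more alike)."""
    shared = set(a) & set(b)
    if not shared:
        return float("-inf")
    return -sum(abs(a[item] - b[item]) for item in shared) / len(shared)

def recommend(user, all_ratings, min_rating=4):
    """Suggest unseen items rated highly by the most similar other user."""
    others = [name for name in all_ratings if name != user]
    neighbor = max(others, key=lambda name: similarity(all_ratings[user], all_ratings[name]))
    return sorted(
        item for item, score in all_ratings[neighbor].items()
        if item not in all_ratings[user] and score >= min_rating
    )

print(recommend("ana", ratings))
```

Nothing in the code describes what a novel or an album is; the suggestion comes only from the agreement between two users' ratings.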
Q: A data professional is considering whether the data they are using to
build a model is well-sourced. Which PACE stage does this scenario describe?
- Plan
- Analyze
- Construct
- Execute
Explanation: Investigating the data, understanding its properties, spotting trends, and preparing the data for modeling are the essential steps of the Analyze stage.
Q: Which of the following statements accurately describe Python
notebooks and scripts? Select all that apply.
- Python scripts are useful for pairing code with human-readable
descriptions and outputs.
- Python notebooks are executed by a computer without the need for human
supervision.
- Data professionals often alternate between Python notebooks and
scripts.
- Data professionals can use both Python notebooks and scripts to execute
code.
Explanation: Python notebooks, such as Jupyter notebooks, provide an interactive environment in which code is run in cells, giving rapid feedback and visualization of results alongside the code. Python scripts, by contrast, are files that can be run on their own from the command line or included in larger projects. Both can execute code, and data professionals often alternate between them.
Q: Fill in the blank: The data visualization package _____ is designed
primarily for statistical visualization.
- Tableau
- Plotly
- Matplotlib
- Seaborn
Explanation: Seaborn is built on top of Matplotlib and offers a higher-level interface for producing statistical visualizations that are both attractive and informative. It is designed to work well with structured data and makes it easier than Matplotlib alone to produce complex graphics such as statistical and categorical plots.
Q: Fill in the blank: In a typical business, a data professional is most
likely to request assistance from the _____ department to obtain preliminary
information about a dataset.
- sales
- information technology
- business intelligence
- marketing
Explanation: Business intelligence (BI) departments are normally responsible for collecting, analyzing, and presenting business information to support decision-making within an organization. They often have access to data repositories, data warehouses, and analytical tools, so they can give data professionals essential context and preliminary insight into a dataset before analysis or modeling begins.
Q: A data analytics team at a household goods manufacturer works on a
task involving discrete variables. Which of the following variables might be
part of the project dataset? Select all that apply.
- Type of most popular toaster
- Total days a sale lasts in March
- Number of appliances for sale at a retail store
- Amount of people in a household
Explanation: Discrete variables are countable values, usually whole numbers. "Number of appliances for sale at a retail store" and "Amount of people in a household" are counts (for example, two people or three people), so they are discrete. "Total days a sale lasts in March" is also discrete when it is recorded as a whole number of days; it would be continuous only if partial days (for example, 1.5 days or 3.2 days) were measured. "Type of most popular toaster" names a category, so it is categorical rather than discrete.
Q: Which of the following statements accurately describe content-based
filtering? Select all that apply.
- Content-based filtering properties never have to be selected and mapped
manually.
- Content-based filtering does not require information from other users to
work properly.
- Content-based filtering is ineffective at making recommendations across
content types.
- Content-based filtering can go beyond comparing items to recommending
other things that match a user’s preferences.
Explanation: Content-based filtering recommends items based on the characteristics and attributes of the items themselves rather than on information from other users. Guided by the user's preferences, it recommends items whose attributes are similar to those the user has liked or engaged with in the past.
Q: A data professional is considering whether the data they are using
to build a model is appropriate. Which PACE stage does this scenario
describe?
- Construct
- Execute
- Analyze
- Plan
Explanation: The Plan stage includes defining the problem, gathering requirements, understanding the data sources, and planning a solution that meets the business need before moving forward. During this stage, the data professional makes sure that the data needed for analysis and modeling is appropriate, relevant, and of adequate quality.
Q: What are some advantages of Python notebooks? Select all that apply.
- They automatically choose the best machine learning model to use for a
data project.
- They are useful for pairing code with human-readable descriptions and
outputs.
- Noncode elements can be embedded directly into the file.
- They offer functional advantages, such as the ability to export PDF
files.
Explanation: Python notebooks do not automatically choose the most appropriate machine learning model, and exporting to formats such as PDF generally depends on additional libraries or tools rather than on the notebook itself. Their main advantages relate to documentation: they pair code with human-readable explanations and outputs, and they allow noncode elements to be embedded in a single interactive document.
Q: Fill in the blank: A data professional may request assistance from
the _____ department to find out what hardware and software are available for a
data project.
- sales
- business intelligence
- marketing
- information technology
Explanation: The information technology (IT) department is usually responsible for managing the organization's technology infrastructure, including hardware, software, networks, and data storage. It can advise on and support the technical resources available for projects across the company, including data projects.
Q: Fill in the blank: In the process of _____, policies will change
depending on whether a reward or punishment is received.
- quality assurance
- artificial intelligence
- deep learning
- reinforcement learning
Explanation: Reinforcement learning is a machine learning paradigm in which an agent learns to make decisions by interacting with its environment. The agent receives feedback in the form of rewards or penalties for its actions and adapts its behavior to maximize the cumulative reward over time. In reinforcement learning, therefore, policies (the agent's decision strategies) change according to the rewards or penalties that result from the agent's actions.
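A minimal sketch of this feedback loop, assuming a made-up environment with two actions and fixed rewards. The REWARDS table, the alternating exploration scheme, and the learning rate are assumptions chosen to keep the example deterministic. The agent's value estimates shift toward the feedback it receives, and the policy follows the estimates.

```python
# Hypothetical environment: each action yields a fixed reward or penalty.
REWARDS = {"left": -1.0, "right": 1.0}

def train(episodes=20, learning_rate=0.5):
    """Learn action values from feedback."""
    values = {action: 0.0 for action in REWARDS}
    for episode in range(episodes):
        # Alternate actions so both are tried (a simple stand-in for exploration).
        action = "left" if episode % 2 == 0 else "right"
        reward = REWARDS[action]
        # Move the estimate toward the observed reward.
        values[action] += learning_rate * (reward - values[action])
    return values

def policy(values):
    """Greedy policy: choose the action with the highest learned value."""
    return max(values, key=values.get)

learned = train()
print(learned, policy(learned))
```

Before training both actions look equally good; after training the penalized action has a negative value and the policy avoids it.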
Q: A data professional at a construction company works on a task
involving continuous variables. Which of the following variables might be part
of the project dataset? Select all that apply.
- The number of pallets on a truck
- The age of a building
- The height of a skyscraper
- The weight of a concrete block
Explanation: "The number of pallets on a truck" is a discrete variable because it is a count of separate items (for example, one pallet or two pallets). Continuous variables are measurements that can take any value within a range. The age of a building, the height of a skyscraper, and the weight of a concrete block are therefore the continuous variables that might be part of the project dataset.
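In code, the distinction often shows up in how values are stored. The helper below is only a rough heuristic with a hypothetical name (variable_kind) and made-up columns: it treats text as categorical, whole numbers as discrete, and other numbers as continuous. Storage type is a proxy, so a measured quantity that happens to be saved as integers would be mislabeled.

```python
def variable_kind(values):
    """Rough heuristic: text -> categorical, whole numbers -> discrete, else continuous."""
    if all(isinstance(v, str) for v in values):
        return "categorical"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "discrete"
    return "continuous"

# Hypothetical construction-site columns.
columns = {
    "pallets_on_truck": [12, 8, 15],
    "block_weight_kg": [17.4, 18.05, 16.9],
    "building_use": ["office", "retail", "office"],
}

kinds = {name: variable_kind(values) for name, values in columns.items()}
print(kinds)
```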
Q: Fill in the blank: One benefit of collaborative filtering is that it
can effectively _____ across content types.
- make recommendations
- produce metadata
- eliminate outliers
- visualize data
Explanation: Collaborative filtering recommends items (such as movies, products, and articles) to a user based on the tastes and behavior of other, similar users. Unlike content-based filtering, which depends on the attributes of the items themselves, collaborative filtering draws on user interactions with items and on similarities between users. This lets it recommend items of different content types, provided there are enough user interactions and similarities to base the recommendations on. As a result, collaborative filtering is especially effective at making personalized recommendations across a wide range of content types.
Q: Which of the following applications would be well-suited to the use
of Python scripts? Select all that apply.
- A task that pairs code with human-readable descriptions
- A task that requires a human-readable output
- A program that incorporates several files
- A program that contains errors in need of debugging
Explanation: Python scripts can produce output that people can read directly, which suits tasks that need clear, intelligible output. Because Python supports modular programming and handles multiple files and modules well, scripts are a good fit for programs that incorporate several files. They also suit debugging: Python's error messages are usually easy to understand and informative, and debugging tools and techniques are available.
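For illustration, a script is often laid out as in the sketch below. The file name report.py and the function names are hypothetical. Keeping the logic in functions lets other files import it, and the guard at the bottom runs main() only when the file is executed directly.

```python
"""Hypothetical script, saved as report.py and run with: python report.py"""

def summarize(values):
    """Return the count and mean of a list of numbers."""
    return {"count": len(values), "mean": sum(values) / len(values)}

def main():
    # Sample data stands in for values a real script would load from a file.
    result = summarize([2.0, 4.0, 6.0])
    # Human-readable output printed to the terminal.
    print(f"count={result['count']} mean={result['mean']:.2f}")

if __name__ == "__main__":
    main()
```

Another file in the same project could reuse the logic with `from report import summarize`, and a failure inside summarize would be reported with a traceback naming the file and line.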
Q: Fill in the blank: The data visualization package _____ is effective
when creating presentations, such as designing a data visualization for an
interactive dashboard.
- Matplotlib
- HTML
- Tableau
- Plotly
Explanation: The data visualization package Plotly is effective for creating presentations, such as designing a data visualization for an interactive dashboard.
Q: Fill in the blank: A data professional working on an email campaign
may request assistance from the _____ department to understand the purpose of
their data work and confirm they are working toward a clear target.
- business intelligence
- information technology
- finance
- marketing
Explanation: A data professional working on an email campaign may ask the marketing department for help in understanding the purpose of their data work and confirming that they are working toward a clear target.
Q: Which of the following statements correctly describe supervised and
unsupervised machine learning? Select all that apply.
- Supervised machine learning uses algorithms to analyze and cluster
unlabeled datasets.
- In unsupervised machine learning, data professionals ask the model to
give them information without telling the model what the answer should
be.
- Supervised machine learning uses labeled datasets to train algorithms to
classify or predict outcomes.
- Data professionals use supervised machine learning for prediction.
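Explanation: Supervised machine learning uses labeled datasets to train algorithms to classify or predict outcomes, and data professionals use it for prediction. Analyzing and clustering unlabeled datasets is the work of unsupervised machine learning, in which the model is asked for information without being told what the answer should be.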
Q: Fill in the blank: Matplotlib is a type of _____, which enables data
professionals to create plots and graphs for data projects.
- data visualization package
- machine learning package
- operational package
- mathematical package
Explanation: Matplotlib is a type of data visualization package that enables data professionals to create plots and graphs for data projects.
Q: Fill in the blank: The process of _____ involves models made of
layers of interconnected nodes. Each layer receives signals from its preceding
layer, and nodes that are activated pass transformed signals to another layer
or a final output.
- deep learning
- reinforcement learning
- artificial intelligence
- quality assurance
Explanation: Deep learning models are built from layers of interconnected nodes. Each layer receives signals from the layer before it, and the nodes that are activated pass transformed signals on to the next layer or to the final output.
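The layer-to-layer signal passing can be sketched in a few lines. The weights, biases, and the choice of ReLU as the activation are hand-picked assumptions for illustration; a real model would learn its weights from data and would normally use a numerical library.

```python
def relu(x):
    """Activation: pass positive signals through, silence the rest."""
    return max(0.0, x)

def layer(inputs, weights, biases):
    """Each node sums its weighted inputs, adds a bias, then applies ReLU."""
    return [
        relu(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

# Hypothetical hand-picked weights; a real model would learn these from data.
hidden = layer([1.0, 2.0], weights=[[0.5, -1.0], [1.0, 1.0]], biases=[0.0, -1.0])
output = layer(hidden, weights=[[2.0, 1.0]], biases=[0.5])
print(hidden, output)
```

The first hidden node receives a negative total and stays inactive, so only the second node's signal reaches the output layer.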
Q: Fill in the blank: One drawback of collaborative filtering is that
the data has a lot of _____ values.
- missing
- inaccurate
- conflicting
- redundant
Explanation: One drawback of collaborative filtering is that its data contains many missing values. The technique depends on recorded interactions or ratings between users and items, and because a typical user interacts with only a few of the available items, the resulting gaps can reduce the effectiveness of the recommendation system.
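To make the scale of the problem concrete, the sketch below measures how much of a small, made-up ratings table is empty. The table contents and the function name missing_share are assumptions for illustration.

```python
# Hypothetical ratings table (columns: novel, podcast, film, album).
# None marks an item the user never rated.
ratings = {
    "ana":  [5, None, None, None],
    "ben":  [None, 4, None, 5],
    "cara": [None, None, 2, None],
}

def missing_share(table):
    """Fraction of user-item cells that have no rating."""
    cells = [value for row in table.values() for value in row]
    return sum(value is None for value in cells) / len(cells)

print(missing_share(ratings))
```

Even in this tiny table most cells are empty, and no two users have rated the same item, so there is nothing to compare them on.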
Q: Fill in the blank: Supervised machine learning uses labeled datasets
to train _____ to classify or predict outcomes.
- clusters
- algorithms
- dashboards
- networks
Explanation: Supervised machine learning uses labeled datasets to train algorithms to classify or predict outcomes.