1. Fill in the blank: In spreadsheets, data analysts begin _____ with an equal sign (=).
- cells
- charts
- formulas
- numbers
Explanation: Formulas in spreadsheets begin with an equal sign (=).
2. What do data analysts use to label the type of data contained in each column in a spreadsheet?
- Menus
- Attributes
- Tables
- Headings
Explanation: Data analysts use headings, also called column headers or labels, to identify the kind of data in each column of a spreadsheet.
3. To determine an organization’s annual budget, a data analyst might use a slideshow.
- True
- False
Explanation: A slideshow may be used to present the annual budget analysis to stakeholders, but determining the budget itself calls for data analysis, financial modeling, and possibly dedicated budgeting software or tools. The presentation communicates the results, insights, and recommendations to the audience; it does not produce them.
4. Which of the following statements describes a key difference between formulas and functions?
- Formulas span two or more cells, and functions exist in only one cell.
- Formulas are written by the user, and functions are already defined.
- Formulas are used in graphs, and functions are not.
- Formulas contain words and numbers, and functions contain numbers only.
Explanation: Formulas are expressions that the user writes to perform calculations on values in a spreadsheet; they begin with an equal sign (=). A function is a predefined operation that takes one or more inputs, called arguments, and returns a result. That is the key distinction: the user writes a formula, while a function is built in and offers a standardized way to carry out specific operations, which can simplify complex calculations.
5. In the function =MAX(A1:A12), what does A1:A12 represent?
- The maximum
- The formula
- The range
- The operator
Explanation: In the function =MAX(A1:A12), A1:A12 is the range. It refers to all of the cells from A1 through A12. The MAX function returns the largest value found in that range.
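As a rough illustration outside the spreadsheet, the sketch below mimics in Python what =MAX(A1:A12) computes. The list `column_a` is made-up sample data standing in for cells A1 through A12; it is not taken from any real sheet.

```python
# Hypothetical values standing in for cells A1 through A12 (twelve cells).
column_a = [14, 3, 27, 8, 19, 42, 5, 11, 30, 2, 16, 9]

# The range is the input; the function (here Python's built-in max)
# is the preset operation applied to it.
largest = max(column_a)
print(largest)  # 42
```

The range only says which cells to look at. The function decides what to do with them.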
6. What is the correct spreadsheet formula for multiplying cell D5 times cell D7?
- =D5^D7
- =D5/D7
- =D5*D7
- =D5xD7
7. Fill in the blank: By negatively influencing data collection, ____ can have a detrimental effect on analysis.
- bias
- objectivity
- partiality
- filtering
Explanation: Bias can harm analysis because it negatively influences how data is collected.
8. Which of the following are ways that data analysts can add context to their data? Select all that apply.
- Create reports for stakeholders
- Consider where the data came from
- Ask questions about the data
- Use descriptive column headers
9. Both formulas and functions in spreadsheets begin with what symbol?
- Colon (:)
- Bracket ([)
- Hyphen (-)
- Equal sign (=)
Explanation: In spreadsheets, both formulas and functions begin with an equal sign (=).
10. A data analyst could use spreadsheets to achieve which of the following tasks?
- Predict next quarter’s sales
- Motivate employees
- Build code for a new app
- Write reports
11. Formulas are created by the user, whereas functions are preset commands in spreadsheets.
- True
- False
12. In the function =MAX(G3:G13), what does G3:G13 represent?
- The range
- A table
- An attribute
- An observation
Explanation: In the function =MAX(G3:G13), G3:G13 is the range. It refers to all of the cells from G3 through G13. The MAX function returns the largest value found in that range.
13. What is the correct spreadsheet formula for multiplying cell K3 times cell K8?
- =K3*K8
- =K3^K8
- =K3xK8
- =K3/K8
14. To avoid bias when collecting data, a data analyst should keep what in mind?
- Opinion
- Context
- Graphs
- Stakeholders
15. Attributes are used in spreadsheets for what purpose?
- Label the data in each column
- Insert data into each column
- Analyze the data in a row
- Add a new column
16. A data analyst might use descriptive column headers in order to achieve what goal?
- Alphabetize the spreadsheet data
- Filter the data
- Add context to their data
- Protect the spreadsheet
Explanation: Descriptive column headers make the data in a spreadsheet clearer and easier to understand. They are considered a best practice in data analysis because they improve how the information in the spreadsheet is communicated and interpreted.
17. Which of the following statements accurately describe formulas and functions? Select all that apply.
- Formulas and functions assist data analysts in calculations, both simple and complex.
- Functions are preset commands that perform calculations.
- Formulas are instructions that perform specific calculations.
- Formulas may only be used once per spreadsheet column.
Explanation: In spreadsheets, functions are preset operations that accept one or more inputs, called arguments, and return a result. Because they are built in, functions offer a standardized way to carry out specific operations and can simplify complex calculations.
18. In the function =MAX(B5:B15), what does B5:B15 represent?
- Column
- Attribute
- Observation
- Range
Explanation: In the function =MAX(B5:B15), B5:B15 is the range. It refers to all of the cells from B5 through B15. The MAX function returns the largest value found in that range.
19. What is the correct spreadsheet formula for multiplying cell H2 times cell H5?
- =H2xH5
- =H2/H5
- =H2*H5
- =H2^H5
20. Relational databases contain a series of tables connected to form relationships. Which two types of fields exist in two connected tables?
- Primary and foreign keys
- Internal and external data
- Descriptive and structural metadata
- Star and snowflake schemas
21. Data analysts use metadata for what tasks? Select all that apply.
- To perform data analyses
- To evaluate the quality of data
- To interpret the contents of a database
- To combine data from more than one source
22. Structural metadata indicates how a piece of data is organized and whether it’s part of one or more than one data collection.
- True
- False
Explanation: Structural metadata describes how data is arranged. It indicates how the data is organized, how pieces of data relate to one another, and whether a piece of data belongs to one or more data collections. This makes it essential for understanding how data is stored and retrieved within a database or system.
23. What is the process that data analysts use to ensure the formal management of their company’s data assets?
- Data mapping
- Data governance
- Data aggregation
- Data integrity
Explanation: The process that data analysts use to ensure the formal management of their organization's data assets is called data governance. It involves creating and applying rules, processes, and standards so that data is managed effectively and responsibly throughout its life cycle.
24. A data analyst chooses not to use external data because it represents diverse perspectives. This is an appropriate decision when working with external data.
- True
- False
Explanation: Declining to use external data only because it represents diverse perspectives is not an appropriate decision. Diverse perspectives can lead to a broader and more accurate understanding of the topic.
External data sources can add insights and alternative viewpoints that strengthen the analysis and help prevent the bias that can come from relying mainly on internal data. Analysts should still examine and verify external sources for reliability, trustworthiness, and relevance to the goals of the analysis.
A better approach is to evaluate the quality and usefulness of the external data and then combine it deliberately with internal data. Drawing on a variety of sources can produce more reliable, better-informed findings.
25. A data analyst reviews a database of Wisconsin car sales to find the last car models sold in Milwaukee in 2019. How can they sort and filter the data to return the last five cars sold at the top of their list? Select all that apply.
- Sort by sale date in ascending order
- Sort by sale date in descending order
- Filter out sales outside of Milwaukee
- Filter out sales not in 2019
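A minimal sketch of how filtering and sorting combine to put the most recent sales at the top. The rows, the model names, and the field names (`model`, `city`, `sale_date`) are invented for illustration; the question does not show the real database.

```python
from datetime import date

# Hypothetical rows standing in for the Wisconsin car sales database.
sales = [
    {"model": "Model A", "city": "Milwaukee", "sale_date": date(2019, 3, 14)},
    {"model": "Model B", "city": "Madison", "sale_date": date(2019, 12, 31)},
    {"model": "Model C", "city": "Milwaukee", "sale_date": date(2019, 12, 28)},
    {"model": "Model D", "city": "Milwaukee", "sale_date": date(2020, 1, 2)},
    {"model": "Model E", "city": "Milwaukee", "sale_date": date(2019, 11, 5)},
    {"model": "Model F", "city": "Milwaukee", "sale_date": date(2019, 12, 30)},
    {"model": "Model G", "city": "Milwaukee", "sale_date": date(2019, 7, 21)},
    {"model": "Model H", "city": "Milwaukee", "sale_date": date(2019, 10, 9)},
]

# Filter: keep only Milwaukee sales made in 2019.
milwaukee_2019 = [
    s for s in sales
    if s["city"] == "Milwaukee" and s["sale_date"].year == 2019
]

# Sort: descending by sale date, so the latest sales come first.
milwaukee_2019.sort(key=lambda s: s["sale_date"], reverse=True)

last_five = milwaukee_2019[:5]
print([s["model"] for s in last_five])
# ['Model F', 'Model C', 'Model E', 'Model H', 'Model G']
```

With an ascending sort the same five rows would land at the bottom of the list instead of the top.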
26. When writing a query, the name of the dataset can either be inside two backticks, or not, and the query will still run properly.
- True
- False
Explanation: Whether backticks are used to enclose a dataset name depends on the database or query language. Some query languages and databases use backticks (also called back quotes) to delimit identifiers such as table or column names, particularly when the names contain spaces or other special characters.
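The sketch below tries the same query with and without backticks. It assumes SQLite (through Python's built-in `sqlite3` module) as a stand-in, since the question does not name a specific tool; SQLite accepts backtick-quoted identifiers for compatibility with MySQL. The `sales` table and its rows are hypothetical, and other databases may behave differently.

```python
import sqlite3

# In-memory database with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT)")
conn.execute("INSERT INTO sales VALUES ('laptop'), ('phone')")

# Same query, once with a bare table name and once with backticks.
plain = conn.execute("SELECT * FROM sales").fetchall()
quoted = conn.execute("SELECT * FROM `sales`").fetchall()

print(plain == quoted)  # True
conn.close()
```

For a simple name like `sales` the quoting makes no difference here. It starts to matter when a name contains spaces or other special characters.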
27. You are working with a database table that contains customer data. The first_name column lists the first name of each customer. You are only interested in customers with the first name Mark.
You write the SQL query below. Add a WHERE clause that will return only customers named Mark.
SELECT
*
FROM
customer
How many customers are named Mark?
- 5
- 2
- 3
- 1
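One way the completed query might look, run against SQLite through Python's `sqlite3` module. The rows below are invented sample data, not the course dataset, so the count printed here says nothing about the correct answer to the question. The `customer_id` column is also an assumption added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT)"
)

# Hypothetical customers; the real table is not shown in the question.
names = ["Mark", "Maria", "Mark", "Luis", "Mark", "Helena", "Mark"]
conn.executemany(
    "INSERT INTO customer (first_name) VALUES (?)", [(n,) for n in names]
)

# The original query with a WHERE clause added.
query = """
SELECT
  *
FROM
  customer
WHERE
  first_name = 'Mark'
"""
rows = conn.execute(query).fetchall()
mark_count = len(rows)
print(mark_count)  # 4 for this sample data
conn.close()
```

The same pattern works for any text column: only the column name and the quoted value in the WHERE clause change.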
28. When working with data from an external source, what can metadata help data analysts do? Select all that apply.
- Choose which analyses to run
- Combine data from more than one source
- Understand the contents of a database
- Ensure data is clean and reliable
29. Think about data as driving a taxi cab. In this metaphor, which of the following are examples of metadata? Select all that apply.
- Passengers the taxi picks up
- Make and model of the taxi cab
- License plate number
- Company that owns the taxi
30. What are some key benefits of using external data? Select all that apply.
- External data is free to use.
- External data is always reliable.
- External data can provide industry-level perspectives.
- External data has broad reach.
Explanation: The benefits of external data depend on the quality, relevance, and reliability of its sources. Validate external data carefully before relying on it for a particular analysis or decision.
31. A data analyst reviews a national database of movie theater showings. They want to find the first movies shown in San Francisco in 2001. How can they organize the data to return the first 10 movies shown at the top of their list? Select all that apply.
- Sort by date in descending order
- Sort by date in ascending order
- Filter out showings outside of San Francisco
- Filter out showings not in 2001
32. You are working with a database table that contains customer data. The city column lists the city where each customer is located. You want to find out which customers are located in Berlin.
You write the SQL query below. Add a WHERE clause that will return only customers located in Berlin.
SELECT
*
FROM
customer
How many customers are located in Berlin?
- 9
- 12
- 2
- 7
33. Primary and foreign keys are two connected identifiers within separate tables. These tables exist in what kind of database?
- Primary
- Metadata
- Normalized
- Relational
Explanation: Relational databases, such as MySQL, PostgreSQL, Microsoft SQL Server, and Oracle, use primary and foreign keys to ensure data integrity and to link tables to one another.
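A small sketch of a primary key and a foreign key linking two tables, using SQLite through Python's `sqlite3` module. The `customer` and `purchase` tables, their columns, and the rows are all hypothetical. SQLite only enforces foreign keys when the `foreign_keys` pragma is switched on, which is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement

# customer_id is the primary key of customer ...
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
# ... and appears in purchase as a foreign key pointing back to it.
conn.execute("""
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        item TEXT
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Ana')")
conn.execute("INSERT INTO purchase VALUES (10, 1, 'laptop')")

# The shared key lets the two tables be joined.
joined = conn.execute("""
    SELECT customer.name, purchase.item
    FROM purchase
    JOIN customer ON purchase.customer_id = customer.customer_id
""").fetchall()
print(joined)  # [('Ana', 'laptop')]

# A purchase that points at a customer who does not exist is rejected.
try:
    conn.execute("INSERT INTO purchase VALUES (11, 99, 'phone')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
conn.close()
```

The rejected insert is the data-integrity side of the keys; the join is the relationship side.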
34. Fill in the blank: Data governance is the process of ensuring that a company’s _____ are managed in a formal manner.
- data assets
- business tasks
- business strategies
- data engineers
Explanation: Data governance is the process of ensuring that an organization's data assets are managed in a formal manner.
35. A nonprofit maintains a list of how many laptops they provide to each school in the county. In the table, there is a column called number_of_laptops. A data analyst wants to determine which schools were given the fewest laptops. How should they sort the data to return these schools first?
- Sort numerically in descending order
- Sort alphabetically in ascending order
- Sort numerically in ascending order
- Sort alphabetically in descending order
Explanation: Sort the data by the number_of_laptops column in ascending order. The schools that received the fewest laptops then appear first, which makes them easy to identify.
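A brief sketch of an ascending numeric sort, written as a SQL ORDER BY run through Python's `sqlite3` module. Only the column name number_of_laptops comes from the question; the table name `laptop_donations`, the `school` column, and the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE laptop_donations (school TEXT, number_of_laptops INTEGER)"
)

# Hypothetical schools and counts.
conn.executemany(
    "INSERT INTO laptop_donations VALUES (?, ?)",
    [("Northside", 40), ("Lakeview", 12), ("Hillcrest", 25), ("Riverside", 7)],
)

# ASC sorts from the smallest number to the largest.
ordered = conn.execute("""
    SELECT school, number_of_laptops
    FROM laptop_donations
    ORDER BY number_of_laptops ASC
""").fetchall()
print(ordered[0])  # ('Riverside', 7)
conn.close()
```

Swapping ASC for DESC would put the school with the most laptops first instead.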
36. When writing a query, you must remove the two backticks around the name of the dataset in order for the query to run properly.
- True
- False
Explanation: Whether backticks are required around a dataset name depends on the database or query language you are using. Some databases, such as MySQL, use backticks to delimit identifiers, which matters most when an identifier contains spaces or special characters. Others, such as PostgreSQL and SQL Server, do not use backticks; they use double quotes or square brackets instead.
Check the documentation for the database or query language you are working with to confirm its identifier syntax, and follow that syntax so the query executes correctly.
37. Think about data as a student at a high school. In this metaphor, which of the following are examples of metadata? Select all that apply.
- Grades the student earns
- Classes the student is enrolled in
- Student’s ID number
- Student’s enrollment date
38. Fill in the blank: Data _____ is the process of ensuring the formal management of a company’s data assets.
- aggregation
- governance
- mapping
- integrity
Explanation: Data governance is the process of ensuring the formal management of a company's data assets.
39. In what circumstance might a data analyst choose not to use external data in their analysis?
- The data cannot be confirmed to be reliable
- The data is free for anyone to access
- The data represents diverse perspectives
- The data is too thorough