Week 1 – Data types and structures

 


1. A data analyst at a book publisher is working on an urgent report for executives. They are using only historical data. What is the most likely reason for choosing to analyze only historical data?

Answers

·        The project has a very short time frame

·        The data is unknown

·        There is plenty of time to research historical data

·        The data is constantly changing

Explanation: The most likely reason is the short time frame. Collecting new data takes time, so when a report is urgent, an analyst typically works with historical data that already exists. Analyzing historical data can still show which strategies succeeded or failed in the past, and it gives a foundation for educated predictions about future outcomes.

 

2. Which of the following are examples of discrete data? Select all that apply.

Answers

·        Box office returns

·        Movie running time

·        Movie budget

·        Number of actors in movie

 

3. Which of the following questions collects nominal qualitative data?

Answers

·        Is this your first time dining at this restaurant?

·        How many people do you usually dine with?

·        How many times have you dined at this restaurant?

·        On a scale of 1-10, how would you rate your service today?

 

Explanation: Nominal qualitative data uses categories or labels that have no intrinsic order. A yes-or-no question such as "Is this your first time dining at this restaurant?" collects nominal data because the answers are labels with no ranking. The same holds for categories like "Mystery," "Romance," and "Science Fiction": responses can be grouped under them, but no order or rating is involved.

 

4. Why is internal data considered more reliable and easier to collect than external data?

Answers

·        Internal data circumvents privacy restrictions.

·        Internal data comes from people you know.

·        Internal data has much larger sample sizes.

·        Internal data lives within a company’s own systems.

 

5. A social media post is an example of structured data.

Answers

·        True

·        False

Explanation: False. Most social media posts are unstructured data. Unstructured data has no preset data model and often contains a lot of text, so it is more difficult to organize and analyze using standard approaches.

Structured data, on the other hand, is organized in a defined format and most often resides in databases with distinct categories and relationships.

Although a social media post may have some structured components (such as timestamps, user IDs, or hashtags), its primary content, which may include text, photographs, or videos, is usually unstructured. Analyzing unstructured data and getting useful insights from it often requires more sophisticated methods, such as natural language processing.

6. Fill in the blank: A Boolean data type can have _____ possible values.

Answers

·        three

·        10

·        two

·        infinite

Explanation: A Boolean data type can take only two values: true or false. It represents binary logic, where the outcome is either true or false, on or off, 1 or 0.
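
As a small illustration, here is a Python sketch of the two-value rule. The variable names and the party-size threshold are hypothetical and not part of the quiz; the point is only that every value converted to a Boolean ends up as either True or False.

```python
# A Boolean holds exactly one of two values.
is_first_visit = True
has_loyalty_card = False

# Comparisons produce Booleans as well (the threshold 6 is made up).
party_size = 4
is_large_party = party_size > 6


def boolean_values():
    """Collect every distinct result of converting assorted values to bool."""
    samples = (0, 1, "", "text", None, 3.5)
    return sorted({bool(sample) for sample in samples})


print(is_large_party)    # False
print(boolean_values())  # [False, True]
```

No matter how many different inputs are converted, the set of results never grows beyond two members.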

 

7. The following is a selection from a spreadsheet:

What kind of data format does it contain?

Answers

·        Short

·        Wide

·        Narrow

·        Long

 

8. A data analyst is working in a spreadsheet application. They use Save As to change the file type from .XLS to .CSV. This is an example of a data transformation.

Answers

·        True

·        False

Explanation: True. Converting a file from one format to another, such as from .XLS to .CSV, is an example of a data transformation. Here, a spreadsheet file in the Excel format is converted to CSV (Comma-Separated Values), a standard way to express tabular data as plain text. Transformations like this are often performed to make data usable across different software programs or to prepare it for a particular analysis.
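
The sketch below shows the idea in Python, using only the standard library. Reading a real .XLS file needs a third-party library, so the rows here are hypothetical values standing in for cells already read from a spreadsheet; only the step that writes them out as CSV text is shown.

```python
import csv
import io

# Hypothetical rows standing in for cells read from a spreadsheet.
rows = [
    ["title", "genre", "copies_sold"],
    ["Book A", "Mystery", 1200],
    ["Book B", "Romance", 950],
]


def rows_to_csv(table):
    """Serialize a list of rows as comma-separated plain text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(table)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)
print(csv_text)
```

The values stay the same; only the way they are stored changes, which is what makes this a transformation of format rather than of content.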

Shuffle Q/A 1

 

9. A data analyst is working on an urgent traffic study. As a result of the short time frame, which type of data are they most likely to use?

Answer

·        Theoretical

·        Historical

·        Personal

·        Unclean

Explanation: Because the study is urgent, the data analyst would most likely use historical data. Collecting new data takes time that a short deadline does not allow, while historical data already exists and can be analyzed right away. Real-time sources such as traffic sensors or GPS devices give direct insight into current road conditions, but they help on a short deadline only when that data is already being collected.

 

10. Nominal qualitative data has a set order or scale.

Answer

·        True

·        False

Explanation: False. Nominal qualitative data consists of categories or labels and has no intrinsic order or scale. It is the lowest level of measurement and implies no quantitative relationship between the categories. Each category is treated as distinct, with no ranking among them.

In contrast, ordinal qualitative data does have a meaningful order or scale. The categories in ordinal data can be ranked or arranged, but the distances between them are not necessarily equal or measurable.

Therefore, nominal data refers to categories that are not arranged in any particular order, while ordinal data includes categories that are arranged in a meaningful manner.
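
The difference can be sketched in Python. The survey labels, the rank mapping, and the genre list below are all hypothetical examples; the sketch only shows that ordinal labels can be sorted by rank, while nominal labels can be counted but not meaningfully ranked.

```python
from collections import Counter

# Ordinal: the labels carry a meaningful rank (mapping is an assumption).
satisfaction_rank = {"poor": 1, "fair": 2, "good": 3, "excellent": 4}
responses = ["good", "poor", "excellent", "fair"]
ranked = sorted(responses, key=satisfaction_rank.get)

# Nominal: labels only. Counting per category makes sense; ranking does not.
genres = ["Mystery", "Romance", "Mystery", "Science Fiction"]
genre_counts = Counter(genres)

print(ranked)                   # ['poor', 'fair', 'good', 'excellent']
print(genre_counts["Mystery"])  # 2
```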

 

11. Internal data is more reliable because it’s clean.

Answer

·        True

·        False

Explanation: False. Internal data is often considered more reliable because it lives within the organization's own systems, which gives the organization more control over how it is collected, stored, and managed. That control does not guarantee the data is clean: internal data can still need cleaning before analysis.

 

12. Structured data is likely to be found in which of the following formats? Select all that apply.

Answer

·        Audio file

·        Digital photo

·        Spreadsheet

·        Table

 

13. A Boolean data type must have a numeric value.

Answer

·        True

·        False

Explanation: False. A Boolean data type does not need to have a numeric value. It represents two possible values: true and false. In many programming contexts true corresponds to 1 and false to 0, but pairs such as yes/no or on/off are Boolean as well.
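
Python is one language where the numeric correspondence is built in: `bool` is a subclass of `int`, so True behaves as 1 and False as 0. The sketch below, with made-up flag values, shows that numeric view next to a purely textual yes/no view of the same data.

```python
flags = [True, False, True, True]  # hypothetical survey answers

# In Python, bool is a subclass of int, so True counts as 1 and False as 0.
true_count = sum(flags)

# The same Booleans can be shown as text labels with no numbers involved.
labels = ["yes" if flag else "no" for flag in flags]

print(true_count)             # 3
print(labels)                 # ['yes', 'no', 'yes', 'yes']
print(True == 1, False == 0)  # True True
```

Other tools may store Booleans differently, so the 1/0 mapping is a common convention rather than a requirement of the data type.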

 

14. In long data, separate columns contain the values and the context for the values, respectively. What does each column contain in wide data?

Answer

·        A specific constraint

·        A specific data type

·        A unique data variable

·        A unique format

Explanation: In wide data, each column stands for a unique variable or feature, while each row stands for an individual observation or instance. This contrasts with long data, where separate columns hold the values and the context those values belong to.

Each column represents a distinct variable or feature. For example, if you're dealing with a dataset of students, you might have columns like "Name," "Age," "Grade," etc.

Each row lists the values of all the variables for one observation. In a student dataset, for example, a row may contain one student's name, age, and grade level.
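
To make the contrast concrete, here is a plain-Python sketch that reshapes a wide student table into long form. The student records and the helper name `wide_to_long` are hypothetical; data libraries offer their own reshaping functions, which are not shown here.

```python
# Wide format: one row per student, one column per variable (made-up data).
wide = [
    {"name": "Ana", "age": 14, "grade": 9},
    {"name": "Ben", "age": 15, "grade": 10},
]


def wide_to_long(records, id_key):
    """Turn each non-identifier column into its own (id, variable, value) row."""
    long_rows = []
    for record in records:
        for variable, value in record.items():
            if variable != id_key:
                long_rows.append(
                    {id_key: record[id_key], "variable": variable, "value": value}
                )
    return long_rows


long_rows = wide_to_long(wide, "name")
for row in long_rows:
    print(row)
```

The two wide rows become four long rows: the "variable" column carries the context and the "value" column carries the value, as described in the question.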

 

15. Fill in the blank: Data transformation enables data analysts to change the _____ of the data.

Answer

·        value

·        structure

·        accuracy

·        meaning

Explanation: The structure of the data may be altered by data analysts via the process of data transformation. This procedure includes making adjustments to the structure, organization, or display of data in order to render it more appropriate for analysis, reporting, or particular activities.

 

16. Continuous data is measured and has a limited number of values.

Answer

·        True

·        False

 

Explanation: False. Continuous data is the opposite of discrete data: it is measured and can take an endless number of values within a range. It can be measured with a high degree of precision and, in theory, can take on any value within that range. Height, weight, temperature, time, and distance traveled are all examples of continuous data.

On the other hand, discrete data is counted and has a limited number of distinct values. The number of automobiles in a parking lot, the number of pupils in a classroom, and the number of books on a shelf are all examples of discrete data.

 

17. Which of the following values are examples of a Boolean data type? Select all that apply.

Answer

·        True or false

·        Yes, no, or unsure

·        Yes or no

·        One, two, or three

 

18. If you have a short time frame for data collection and need an answer immediately, you likely will have to use historical data.

Answer

·        True

·        False

 

Explanation: True. With a short time frame and an immediate need for an answer, there is usually no time to collect new data, so you will likely have to rely on historical data that already exists.

Real-time or near-real-time sources, such as live sensors or social media feeds, suit situations that depend on current conditions, but setting up new data collection takes time.

Historical data is also useful for identifying trends and patterns that develop over time.

Shuffle Q/A 2

19. Which of the following is an example of continuous data?

Answer

·        Leading actors in movie

·        Box office returns

·        Movie run time

·        Movie budget

Explanation: Movie run time is continuous data because it is measured and, in theory, can take any value within a range. Temperature is another common illustration: it can be measured with high precision, and its values are not distinct and separate.

Continuous data is measured and may take an endless number of values within a defined range, whereas discrete data is counted and consists of distinct values that are separate from one another.

 

20. Which of the following questions collect nominal qualitative data? Select all that apply.

Answer

·        How likely are you to recommend this restaurant to a friend?

·        Is this your first time dining at this restaurant?

·        Have you heard of our frequent diner program?

·        Did anyone recommend our restaurant to you today?

21. Data transformation can change the structure of the data. An example of this is taking data stored in one format and converting it to another.

Answer

·        True

·        False

Explanation: True. Data transformation involves modifying the format, structure, or representation of data to make it more suitable for analysis, reporting, or specific tasks. Converting data from one format to another, such as from a spreadsheet to a CSV file or from a database to a different file type, is a classic example. It's like giving your data a makeover to make it more compatible and accessible for different purposes.

22. Which of the following is a benefit of internal data?

Answer

·        Internal data is less vulnerable to biased collection.

·        Internal data is the only data relevant to the problem.

·        Internal data is less likely to need cleaning.

·        Internal data is more reliable and easier to collect.

Explanation: Internal data is generated and collected within the organization's own systems and processes. This level of control allows the organization to design and implement standardized data collection methods, ensuring data accuracy, consistency, and reliability. With control over the data generation process, organizations can better manage the quality of their internal data.
