The world of data - For beginners and experienced data analysts
Entering the world of data and continuously developing your own data literacy is a challenge. Our new series gives beginners a practice-oriented approach to the world of data, while advanced data users have the opportunity to improve their knowledge.
Polyteia is well aware of how complex the world of data is and how challenging it can be to get started and to keep developing your own data skills. With our new series, we offer insights on two levels: for beginners and for experienced data analysts. For beginners, the aim is to provide practical access to the world of data. Advanced data analysts are given the opportunity to expand their knowledge.
The first section explains the term data and how it can appear in the daily work of the public sector. You do not need any knowledge of the topic and can easily take your first step into the world of data. If you already have some experience of working with data, you can skip straight to the second section, where you will find detailed information on data types and databases.
For beginners: Basic knowledge of data
What is data? The question may seem very simple at first, but knowledge can only be expanded successfully on a solid foundation.
Data can appear in your day-to-day work in many different forms: numbers, texts, images, audio files and much more. However, all forms have one thing in common: they provide information. For example, a number can show how many schools or kindergartens there are in your local area, or how many retirees, students and workers live there. A text, on the other hand, can record the place of residence or nationality of citizens.
Almost everyone has come into contact with a data set at some point. Data sets can be, for example, Excel worksheets, tables in databases or, in non-electronic form, an index card in a card index. A data set is a list of related data fields. You have probably already created a data set when you noted information such as name, address and date of birth for citizens in a table. In other words, you are already working with data every day, for example when you record your city's coronavirus cases or your department's expenses in an Excel file. Data does not necessarily have anything to do with complex software on your computer. However, such software can help you to understand data better and use it more effectively.
For data analysts: Data types and databases
As you probably already know, the value of data does not lie in its mere existence, but rather in its ability to support decision-making and generate knowledge. Data is a valuable resource that needs to be carefully managed, analyzed and protected in order to reap sustainable benefits.
In the public sector, we mainly deal with structured data, typically numerical values and text. These are symbolic representations of information in a formalized form, organized in clearly defined fields or categories. A typical example of structured data is a database in which information is organized in tables with columns and rows. In contrast, unstructured data has no clear organizational structure; examples include text documents, emails and multimedia content. Structuring data is important because it can significantly improve the efficiency of data processing, analysis and use, especially in areas such as database management, data mining and machine learning.
The structured collection of raw data in an Excel spreadsheet or in cloud-based software creates a data set, which can in turn be stored in a database. A data set can contain one or more variables that hold information about a specific entity or event. Each row in a data set usually represents a single observation or instance, while each column represents a specific characteristic or attribute. This layout is a convention rather than a fixed form: a data set can easily be transposed so that rows become columns and vice versa. The data within a database is categorized into different data types. There are several data types for time data: time, date, or the two combined as a timestamp. Boolean, on the other hand, is the data type for truth values and can have either the value true or the value false. To ensure accuracy and precision in calculations, decimal data types are primarily used in financial and scientific applications. They represent decimal values exactly and, in many systems, offer more significant digits than floating-point types.
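The data types described above can be illustrated with a minimal Python sketch. It uses only the standard library; the record and its field names (`report_date`, `expenses_eur` and so on) are hypothetical and chosen purely for illustration. Database systems name and implement these types differently, so treat this as an analogy rather than a description of any particular database.

```python
from datetime import date, time, datetime
from decimal import Decimal

# One hypothetical observation (a row); each key plays the role of a column.
record = {
    "report_date": date(2024, 3, 1),             # date
    "opening_time": time(8, 30),                 # time
    "recorded_at": datetime(2024, 3, 1, 8, 30),  # timestamp (date and time)
    "is_public": True,                           # Boolean: True or False
    "expenses_eur": Decimal("1250.10"),          # decimal, stored exactly
}

# A date and a time can be combined into a timestamp.
combined = datetime.combine(record["report_date"], record["opening_time"])

# Binary floating-point numbers cannot represent 0.1 or 0.2 exactly,
# so their sum is slightly off. Decimal values avoid this rounding error.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(combined == record["recorded_at"])
print(float_sum == 0.3)
print(decimal_sum == Decimal("0.3"))
```

The comparison at the end is the reason decimal types are preferred for monetary amounts: small floating-point errors can add up over many records, while decimal arithmetic keeps the exact values.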
Databases also come in different forms. Graph databases, for example, specialize in storing and displaying data in the form of graphs and are often used in applications such as social networks, recommendation systems and network analyses. They use nodes and edges to model complex relationships between the data. Another type is the time series database, which specializes in the storage and efficient retrieval of time series data, that is, data collected at regular time intervals. Time series databases are often used in applications such as IoT (Internet of Things), financial analyses and monitoring systems. Relational, NoSQL and object-oriented databases are further forms in which databases can appear.
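To make the idea of nodes and edges more concrete, here is a small sketch in plain Python. It does not use a real graph database; the names, the `follows` relationship and the `recommend` function are all hypothetical. The point is only to show how relationships stored as edges can be traversed to answer a question such as "whom might this person want to follow?".

```python
# Hypothetical edges: (source node, relationship, target node).
edges = [
    ("Anna", "follows", "Ben"),
    ("Ben", "follows", "Clara"),
    ("Anna", "follows", "Clara"),
    ("Clara", "follows", "Dina"),
]

# Build an adjacency structure: each node maps to the nodes it points to.
adjacency = {}
for source, _relation, target in edges:
    adjacency.setdefault(source, set()).add(target)


def recommend(person):
    """Suggest nodes two steps away that `person` does not follow yet."""
    direct = adjacency.get(person, set())
    two_steps = set()
    for neighbour in direct:
        two_steps |= adjacency.get(neighbour, set())
    return sorted(two_steps - direct - {person})


print(recommend("Anna"))
```

A graph database is built to run this kind of traversal efficiently on very large networks, which is why it suits social networks and recommendation systems.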
Develop your data literacy further
Have a look at our free Data Academy learning platform. It supports administrative staff at all levels of knowledge in developing and expanding their data literacy. The Data Academy's interactive online courses cover a variety of topics, including data visualization, data transformation, data platforms, data governance, artificial intelligence and much more. You can register quickly and easily for Polyteia's free Data Academy learning platform via this link.