Analyzing Data with Power BI and Power Pivot for Excel (Alberto Ferrari, Marco Russo)

Introduction

Excel users love numbers. Or maybe it’s that people who love numbers love Excel. Either way, if you are interested in gathering insights from any kind of dataset, it is extremely likely that you have spent a lot of your time playing with Excel, pivot tables, and formulas.

In 2015, Power BI was released. These days, it is fair to say that people who love numbers love both Power Pivot for Excel and Power BI. Both these tools share a lot of features, namely the VertiPaq database engine and the DAX language, inherited from SQL Server Analysis Services.

With previous versions of Excel, gathering insights from numbers was mainly a matter of loading some datasets and then starting to calculate columns and write formulas to design charts. Yes, there were some limitations: the size of the workbook mattered, and the Excel formula language was not the best option for huge number crunching. The new engine in Power BI and Power Pivot is a giant leap forward. Now you have the full power of a database and a gorgeous language (DAX) to leverage. But, hey, with greater power comes greater responsibility! If you want to really take advantage of this new tool, you need to learn more. Namely, you need to learn the basics of data modeling.

Data modeling is not rocket science. It is a basic skill that anybody interested in gathering insights from data should master. Moreover, if you like numbers, then you will love data modeling, too. So, not only is it an easy skill to acquire, it is also incredibly fun.

This book aims to teach you the basic concepts of data modeling through practical examples that you are likely to encounter in your daily life. We did not want to write a complex book on data modeling, explaining in detail the many complex decisions that you will need to make to build a complex solution. Instead, we focused on examples coming from our daily job as consultants. Whenever a customer asked us to help solve a problem, if we felt the issue was a common one, we stored it in a bin. Then, we opened that bin and provided a solution to each of these examples, organizing them so that they also serve as training in data modeling.

When you reach the end of the book, you will not be a data-modeling guru, but you will have acquired a greater sensitivity to the topic. If, at that time, you look at your database, trying to figure out how to compute the value you need, and you start to think that, maybe, changing the model might help, then we will have accomplished our goal with this book. Moreover, you will be on your path to becoming a successful data modeler. This last step, becoming a great data modeler, will only come with experience and after many failures. Unfortunately, experience is not something you can learn in a book.

Who this book is for

This book is aimed at a wide range of readers. You might be an Excel user who uses Power Pivot for Excel, or you may be a data scientist using Power BI. Or you could be starting your career as a business-intelligence professional and want to read an introduction to data modeling. In all these scenarios, this is the book for you.

Note that we did not include in this list people who want to read a book about data modeling. In fact, we wrote the book thinking that our readers probably do not even know they need data modeling at all. Our goal is to make you understand that you need to learn data modeling and then give you some insights into the basics of this beautiful science. Thus, in a sentence: if you are curious about what data modeling is and why it is a useful skill, then this is the book for you.

Assumptions about you

We expect our readers to have a basic knowledge of Excel PivotTables and/or to have used Power BI as a reporting and modeling tool. Some experience in the analysis of numbers is also very welcome. In the book, we do not cover any aspect of the user interface of either Excel or Power BI. Instead, we focus only on data models, how to build them, and how to modify them, so that the code becomes easier to write. Thus, we cover “what you need to do” and we leave the “how to do it” entirely to you. We did not want to write a step-by-step book. We wanted to write a book that teaches complex topics in an easy way.

One topic that we intentionally do not cover in the book is the DAX language. It would have been impossible to treat data modeling and DAX in the same book. If you are already familiar with the language, then you will benefit from reading the many pieces of DAX spread throughout this book. If, on the other hand, you still need to learn DAX, then read The Definitive Guide to DAX, which is the most comprehensive guide to the DAX language and ties in well with the topics in this book.

Organization of this book

The book starts with a couple of easy, introductory chapters followed by a set of monographic chapters, each one covering a specific kind of data model. Here is a brief description of each chapter:

Chapter 1, “Introduction to data modeling,” is a brief introduction to the basic concepts of data modeling. Here we introduce what data modeling is, we start speaking about granularity, and we define the basic models of a data warehouse—that is, star schemas, snowflakes, normalization, and denormalization.

Chapter 2, “Using header/detail tables,” covers a very common scenario: that of header/detail tables. Here you will find discussions and solutions for scenarios where you have, for example, orders and lines of orders in two separate fact tables.

Chapter 3, “Using multiple fact tables,” describes scenarios where you have multiple fact tables and you need to build a report that mixes them. Here we stress the relevance of creating a correct dimensional model to be able to browse data the right way.

Chapter 4, “Working with date and time,” is one of the longest in the book. It covers time intelligence calculations. We explain how to build a proper date table and how to compute basic time intelligence (YTD, QTD, PARALLELPERIOD, and so on), and then we show several examples of working day calculations, handling special periods of the year, and working correctly with dates in general.

Chapter 5, “Tracking historical attributes,” describes the use of slowly changing dimensions in your model. This chapter provides a deeper explanation of the transformation steps needed in your model if you need to track changing attributes and how to correctly write your DAX code in the presence of slowly changing dimensions.

Chapter 6, “Using snapshots,” covers the fascinating aspects of snapshots. We explain what snapshots are, why and when to use them, and how to compute values on top of them, and we provide a description of the powerful transition matrix model.

Chapter 7, “Analyzing date and time intervals,” goes several steps forward from Chapter 4. We cover time calculations, but this time analyzing models where events stored in fact tables have a duration and, hence, need some special treatment to provide correct results.

Chapter 8, “Many-to-many relationships,” explains how to use many-to-many relationships. Many-to-many relationships play a very important role in any data model. We cover standard many-to-many relationships, cascading relationships, and their use with reallocation factors and filters, and we discuss their performance and how to improve it.

Chapter 9, “Working with different granularity,” goes deeper into working with fact tables stored at different granularities. We show budgeting examples where the granularity of fact tables is different and provide several alternatives both in DAX and in the data model to solve the scenario.

Chapter 10, “Segmentation data models,” explains several segmentation models. We start with a simple segmentation by price, we then move to the analysis of dynamic segmentation using virtual relationships, and finally we explain the ABC analysis done in DAX.

Chapter 11, “Working with multiple currencies,” deals with currency exchange. When using currency rates, it is important to understand the requirements and then build the proper model. We analyze several scenarios with different requirements, providing the best solution for each.

Appendix A, “Data modeling 101,” is intended to be a reference. We briefly describe with examples the basic concepts treated in the whole book. Whenever you are uncertain about some aspect, you can jump there, refresh your understanding, and then go back to the main reading.
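To make the star schema mentioned for Chapter 1 a little more concrete, here is a minimal sketch. It is not code from the book, which works in DAX inside Power BI and Power Pivot. It is plain Python, and the table contents and the names `product`, `sales`, and `amount_by_category` are invented for illustration. The fact table stores one row per sale (its granularity) and points to a dimension table through a key.

```python
from collections import defaultdict

# Hypothetical dimension table: one row per product, keyed by product key.
product = {
    1: {"name": "Mouse", "category": "Accessories"},
    2: {"name": "Keyboard", "category": "Accessories"},
    3: {"name": "Laptop", "category": "Computers"},
}

# Hypothetical fact table: one row per sale (the granularity), holding
# only a key into the dimension plus the numbers to aggregate.
sales = [
    {"product_key": 1, "quantity": 2, "amount": 40.0},
    {"product_key": 3, "quantity": 1, "amount": 900.0},
    {"product_key": 2, "quantity": 1, "amount": 35.0},
    {"product_key": 3, "quantity": 2, "amount": 1800.0},
]


def amount_by_category(sales, product):
    """Sum the fact amounts, grouped by a dimension attribute."""
    totals = defaultdict(float)
    for row in sales:
        # Follow the relationship from the fact row to its dimension row.
        category = product[row["product_key"]]["category"]
        totals[category] += row["amount"]
    return dict(totals)


totals = amount_by_category(sales, product)
print(totals)
```

Because the descriptive attributes live only in the dimension, the sketch can group sales by category or by name without changing the fact table.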

The complexity of the models and their solutions increases chapter by chapter, so it is a good idea to read this book from the beginning rather than jumping from chapter to chapter. In this way, you can follow the natural flow of complexity and learn one topic at a time. However, once you have finished it, the book is intended to serve as a reference guide. Thus, whenever you need to solve a specific model, you can jump straight to the chapter that covers it and look into the details of the solution.

Conventions

The following conventions are used in this book:

Boldface type is used to indicate text that you type.

Italic type is used to indicate new terms.

Code elements appear in a monospaced font.

The first letters of the names of dialog boxes, dialog box elements, and commands are capitalized—for example, the Save As dialog box.

Keyboard shortcuts are indicated by a plus sign (+) separating the key names. For example, Ctrl+Alt+Delete means that you press the Ctrl, Alt, and Delete keys at the same time.