but this time, it happens through table expansion. The main difference between bidirectional filtering and table expansion is that the table-expansion pattern always applies the filter, whereas bidirectional filtering applies it only when a filter is active. To see the difference, let us add a new row to the Transactions table that is not related to any account. This row has an amount of 5,000 USD and, because it is not related to any account, it does not belong to any customer. Figure 8-5 shows the result.
FIGURE 8-5 CROSSFILTER and table expansion lead to different results in the grand total.
The difference between the two measures is exactly 5,000 USD, which is the amount that is not related to any customer. It is reported at the grand total in the CROSSFILTER version, but not in the table expansion one. With the CROSSFILTER version, no filter on the customer is active at the grand total, so the fact table shows all the rows. With table expansion, on the other hand, the filter is always active, so the fact table shows only the rows that can be reached through one of the customers. Thus, the additional row is hidden and does not contribute to the grand total.
As often happens in these cases, it is not that one value is more correct than the other. They report different numbers because they follow different calculations. You only need to be aware of the difference so you can use the correct formula depending on your needs. From a performance point of view, because the filter is not applied when it is not necessary, you can expect the version with CROSSFILTER to be slightly faster than the version with table expansion. CROSSFILTER and bidirectional filtering, on the other hand, report the same numbers and behave the same way in terms of performance.
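The grand-total difference can be reproduced outside the model. The following is a minimal Python sketch, not DAX, that simulates the two filtering behaviors. The account names, the amounts other than the unrelated 5,000 USD row, and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical sample data: (account, amount). Only the 5,000 USD row that
# is not related to any account is taken from the text.
transactions = [
    ("Mark", 800),
    ("Paul", 1200),
    ("Mark-Paul", 1000),
    (None, 5000),  # not related to any account, hence to no customer
]

# Hypothetical bridge table: (account, customer) ownership pairs.
accounts_customers = [
    ("Mark", "Mark"),
    ("Paul", "Paul"),
    ("Mark-Paul", "Mark"),
    ("Mark-Paul", "Paul"),
]


def reachable_accounts(customers=None):
    """Accounts owned by the given customers (all customers if None)."""
    return {
        account
        for account, customer in accounts_customers
        if customers is None or customer in customers
    }


def total_crossfilter_style(customers=None):
    """Simulates CROSSFILTER: the filter applies only when one is active."""
    if customers is None:
        # No customer filter (grand total): every fact row is visible.
        return sum(amount for _, amount in transactions)
    accounts = reachable_accounts(customers)
    return sum(amount for account, amount in transactions if account in accounts)


def total_expansion_style(customers=None):
    """Simulates table expansion: the bridge filter is always applied."""
    accounts = reachable_accounts(customers)
    return sum(amount for account, amount in transactions if account in accounts)


grand_total_cf = total_crossfilter_style()
grand_total_te = total_expansion_style()
```

With this sample data, the two grand totals differ by the amount of the unrelated row, whereas both functions return the same value as soon as a customer filter is active.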
Understanding non-additivity
The second important point about many-to-many relationships is that typically, measures aggregated through a many-to-many relationship are non-additive. This is not an error in the model; it is the nature of many-to-many that makes these relationships non-additive. To better understand this, look at the report in Figure 8-6 that shows both the Accounts and the Customers tables on the same matrix.
FIGURE 8-6 Many-to-many relationships generate non-additive calculations.
You can easily see that the column totals are correct, meaning that the total is the sum of all the rows in that column. The row totals, however, are incorrect. This is because the amount of the account is shown for all the customers who own that account. The account Mark-Paul, for example, is owned by Mark and Paul together. Individually, they have 1,000 USD each, but when you consider them together, the total is still 1,000.
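The arithmetic behind this can be checked with a small Python sketch. It assumes hypothetical balances for two single-owner accounts; only the 1,000 USD Mark-Paul account, owned by Mark and Paul together, comes from the text.

```python
# Hypothetical balances per account. Only the shared 1,000 USD
# Mark-Paul account is taken from the text.
balances = {"Mark": 800, "Paul": 1200, "Mark-Paul": 1000}

# Hypothetical ownership: the Mark-Paul account has two owners.
owners = {
    "Mark": {"Mark"},
    "Paul": {"Paul"},
    "Mark-Paul": {"Mark", "Paul"},
}


def amount_for(customers):
    """Sum each account once if at least one given customer owns it."""
    wanted = set(customers)
    return sum(
        balance
        for account, balance in balances.items()
        if owners[account] & wanted
    )


# Value shown for each customer taken individually.
per_customer = {name: amount_for([name]) for name in ("Mark", "Paul")}

# Adding up the individual values counts the shared account twice.
sum_of_customers = sum(per_customer.values())

# Evaluating both customers together counts the shared account once.
total_together = amount_for(["Mark", "Paul"])
```

In this sketch, the sum of the individual values exceeds the value computed for the two customers together by exactly the balance of the shared account, which is the non-additive behavior visible in the matrix.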
Non-additivity is not a problem. It is the correct behavior whenever you work with many-to-many relationships. However, you need to be aware of non-additivity because you can easily be fooled if you do not take it into account. For example, you might iterate over the customers, compute the sum of the amount for each one, and then aggregate the results at the end, which produces a result that is different from the calculation done for the grand total. This is demonstrated in the report in Figure 8-7, which shows the result of the following two calculations:
Interest := [SumOfAmount] * 0.01
Interest SUMX := SUMX ( Customers, [SumOfAmount] * 0.01 )
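To see why the two measures can disagree at the grand total, the following Python sketch simulates them on hypothetical data. It is not DAX: `sum_of_amount` stands in for the [SumOfAmount] measure, and the loop in `interest_sumx` stands in for SUMX evaluating the measure once per customer. The balances and ownership are assumptions made for illustration.

```python
RATE = 0.01  # the 1 percent factor used by both measures

# Hypothetical balances and ownership; Mark-Paul is a shared account.
balances = {"Mark": 800, "Paul": 1200, "Mark-Paul": 1000}
owners = {
    "Mark": {"Mark"},
    "Paul": {"Paul"},
    "Mark-Paul": {"Mark", "Paul"},
}


def sum_of_amount(customers):
    """Stand-in for [SumOfAmount]: each reachable account counted once."""
    wanted = set(customers)
    return sum(
        balance
        for account, balance in balances.items()
        if owners[account] & wanted
    )


def interest(customers):
    """Stand-in for Interest: one evaluation over all visible customers."""
    return sum_of_amount(customers) * RATE


def interest_sumx(customers):
    """Stand-in for Interest SUMX: evaluate per customer, then add up."""
    return sum(sum_of_amount([name]) * RATE for name in customers)


everyone = ["Mark", "Paul"]
interest_total = interest(everyone)
interest_sumx_total = interest_sumx(everyone)
```

With these figures, the single-evaluation version yields about 30 at the grand total, while the per-customer version yields about 40, because the shared account contributes once for each of its owners. For a single customer, the two versions return the same value.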