Dax Zero to Developer
by Tajamul Khan
CONTENTS
PREREQUISITES
• Data vs Lookup tables
• Primary vs Foreign Key
• Cardinality
• Filter Flow
DAX
• Introduction
• DAX Comments
• Error Handling
• Variables
SCALAR FUNCTIONS
• Aggregate Functions
• Iterator Functions
• Round Functions
TIME INTELLIGENCE
• Date Table
• Calendar
• Performance Till Date
• Time Period Shift
• Running Total Functions
PERFORMANCE TUNING
Data Blogs Follow Tajamul Khan
PRIMARY KEY
Primary keys uniquely identify each row of a table and cannot contain NULL values
CARDINALITY
Cardinality refers to the uniqueness of values in a column. A relationship can have one of four cardinalities:
• One to Many
• Many to One
• Many to Many
• One to One
For our purposes, all relationships in the data model should follow a “one to many” cardinality: one instance of each primary key to many instances of each foreign key
FILTER FLOW
Filter flow always runs “downstream”, from the lookup table to the data table
Filters cannot flow “upstream” (against the direction of the relationship)
DAX
DAX (Data Analysis Expressions) is a functional language, i.e. execution flows through function calls. It is used in:
• Power BI
• Analysis Services Tabular
• Power Pivot
It resembles Excel formulas because it was born with Power Pivot
FORMATTING
Code formatting is of paramount importance in DAX.
=
SUMX (
    FILTER ( VALUES ( 'Date'[Year] ), 'Date'[Year] < 2005 ),
    IF (
        'Date'[Year] >= 2000,
        [Sales Amount] * 100,
        [Sales Amount] * 90
    )
)
DAX ENGINES
DAX is powered by two internal engines (formula engine & storage engine) which work
together to compress & encode raw data and evaluate DAX queries
FORMULA ENGINE
Receives, interprets and executes all DAX requests
Processes the DAX query then generates a list of logical steps called a query plan
Works with the datacache sent back from the storage engine to evaluate the DAX query
and return a result
STORAGE ENGINE
Compresses and encodes raw data, and communicates only with the formula engine (it doesn’t understand the DAX language)
Receives a query plan from the formula engine, executes it, and returns a datacache
Data types represent how values are stored by the DAX storage engine
DAX OPERATORS
VERTIPAQ
VertiPaq uses a columnar data structure, which stores data as individual columns (rather than rows or full tables) to quickly and efficiently evaluate DAX queries
CALCULATED COLUMNS
Allow you to add new, formula-based columns to tables
Values are calculated based on information from each row of a table (has row context)
Append static values to each row in a table and store them in the model (which increases file size)
Recalculate on data source refresh or when changes are made to component columns
Primarily used as rows, columns, slicers or filters
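As a minimal sketch of a calculated column, assuming a hypothetical Sales table with Quantity and UnitPrice columns (these names are illustrative, not from the slides):

```dax
-- Calculated column added to the Sales table.
-- The expression is evaluated once per row (row context) and the result is stored in the model.
Line Amount = Sales[Quantity] * Sales[UnitPrice]
```

Because the value is stored for every row, a column like this can be placed on rows, columns, slicers or filters.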
MEASURES
Values are calculated based on information from any filters in the report (has filter context)
Do not create new data in the tables themselves
Recalculate in response to any change to filters within the report
Almost always used within the values field of a visual
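A small illustrative measure, assuming a Sales table with a SalesAmount column (the same names used in the comment examples later in this deck):

```dax
-- Measure: nothing is stored per row.
-- The result is recomputed for whatever filter context the visual provides.
Total Sales = SUM ( Sales[SalesAmount] )
```

Placed in a matrix with years on rows, this one definition would return a different total for each year, since each cell has its own filter context.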
EVALUATION CONTEXT
Evaluation contexts are the pillars of DAX. There are two of them:
Filter Context
Filters tables
Row Context
Iterates rows
FILTER CONTEXT
Filter context filters the tables in your data model
DAX creates filter context when dimensions are added to rows, columns, slicers and filters
CALCULATE can be used to create new filter context or modify the existing one
Filter context always travels (propagates) from the ONE side to the MANY side of a table relationship
ROW CONTEXT
Row context iterates through the rows in a table
DAX creates row context when you add calculated columns to your data model
Iterator functions (SUMX, RANKX, etc.) use row context to evaluate row-level calculations
Row context doesn't automatically propagate through table relationships (use the RELATED or RELATEDTABLE functions instead)
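The points above can be sketched as follows, assuming a hypothetical model where Product (one side) is related to Sales (many side) and Product has a UnitCost column. Each definition would be entered separately:

```dax
-- Calculated column on Sales: RELATED follows the relationship to the ONE side
Unit Cost = RELATED ( Product[UnitCost] )

-- Calculated column on Product: RELATEDTABLE returns the matching rows on the MANY side
Sales Rows = COUNTROWS ( RELATEDTABLE ( Sales ) )

-- Measure: SUMX creates a row context over Sales and evaluates the expression row by row
Total Cost = SUMX ( Sales, Sales[Quantity] * RELATED ( Product[UnitCost] ) )
```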
GOOD PRACTICES
NAMING CONVENTIONS
Measures should not be written with a table name
• Avoid the table name when referencing a measure
• [Margin%] instead of Sales[Margin%]
• Easier to move to another table
• Easier to identify as a measure
Columns → Table[Column]
Measures → [Measure]
EVALUATION ORDER
Evaluation order is the process by which DAX evaluates the parameters in a function
NON NESTED
IF ( Test, True, False )
     1     2     3
NESTED
SUMX (
    FILTER (
        FILTER ( 'Table',
            RELATED ( 'Table'[Column] ) ),      1 = innermost filter
        RELATED ( 'Table'[Column] ) ),          2
    'Table'[Column] )                           3 = outer
DAX SHORTCUTS
Bad comment
Total Sales = SUM(Sales[SalesAmount]) --Sum the sales amount
Good comment
Total Sales = SUM(Sales[SalesAmount]) --Calculate total sales amount
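For reference, DAX accepts three comment styles. The sketch below reuses the Total Sales measure from the examples above; the comment wording is only illustrative:

```dax
Total Sales =
/*
    Multi-line comment:
    useful for describing the business rule behind a measure
*/
// Single-line comment using slashes
SUM ( Sales[SalesAmount] ) -- Single-line comment using dashes
```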
ERROR HANDLING
Error handling functions can be used to help identify missing data, and can be particularly useful for quality assurance and testing
IFERROR()
Returns an alternate value if the first expression results in an error; otherwise returns the value of the expression itself
ISBLANK()
Checks whether a value is blank and returns TRUE or FALSE
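A short sketch of both functions. The Sales[QuantityText] column is a hypothetical text column used only to trigger a conversion error, and [Total Sales] is assumed to be an existing measure. Each definition would be entered separately:

```dax
-- Calculated column: VALUE raises an error when the text is not a number,
-- and IFERROR replaces that error with BLANK
Quantity Clean = IFERROR ( VALUE ( Sales[QuantityText] ), BLANK () )

-- Measure: show 0 instead of a blank when there are no sales in the current filter context
Total Sales or Zero = IF ( ISBLANK ( [Total Sales] ), 0, [Total Sales] )
```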
VARIABLES
Variables are very useful for avoiding repeated subexpressions in your DAX code
Variables can also be a helpful tool for testing or debugging your DAX code
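A minimal sketch of VAR and RETURN, assuming a Sales table and a 'Date' table with a Date column related to it (the measure name is illustrative):

```dax
Sales Growth % =
VAR CurrentSales = SUM ( Sales[SalesAmount] )
VAR PriorSales =
    CALCULATE ( SUM ( Sales[SalesAmount] ), SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
RETURN
    -- PriorSales is used twice but computed once
    DIVIDE ( CurrentSales - PriorSales, PriorSales )
```

For debugging, the RETURN line can be temporarily changed to return CurrentSales or PriorSales on its own.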
SCALAR FUNCTIONS
AGGREGATE FUNCTIONS
Functions that can be used to dynamically aggregate values within a column
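A few common aggregations written as measures, assuming a hypothetical Sales table with SalesAmount and Quantity columns and the Product[ProductCode] column used elsewhere in this deck. Each definition would be entered separately:

```dax
Total Sales = SUM ( Sales[SalesAmount] )
Average Sale = AVERAGE ( Sales[SalesAmount] )
Largest Quantity = MAX ( Sales[Quantity] )
Smallest Quantity = MIN ( Sales[Quantity] )
Sales Rows = COUNTROWS ( Sales )
Product Codes = DISTINCTCOUNT ( Product[ProductCode] )
```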
ITERATOR FUNCTIONS
Iterator functions iterate over a table and evaluate an expression for each row
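As an illustration, assuming a hypothetical Sales table with Quantity and UnitPrice columns, the measures below evaluate the product row by row and then aggregate the row results:

```dax
-- Sum of the row-level products
Total Revenue = SUMX ( Sales, Sales[Quantity] * Sales[UnitPrice] )

-- Average of the row-level products
Average Line Revenue = AVERAGEX ( Sales, Sales[Quantity] * Sales[UnitPrice] )
```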
ROUND FUNCTIONS
Functions that can be used to round values to different levels of precision
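A quick sketch of the most common rounding functions, using literal values so the results are easy to check (the measure names are illustrative):

```dax
Round Example = ROUND ( 123.456, 1 )          -- 123.5
Round Up Example = ROUNDUP ( 123.456, 1 )     -- 123.5
Round Down Example = ROUNDDOWN ( 123.456, 1 ) -- 123.4
Integer Example = INT ( 123.456 )             -- 123
Multiple Example = MROUND ( 123.456, 5 )      -- 125
```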
INFORMATION FUNCTIONS
Functions that can be used to analyze the data type or output of an expression
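A few illustrative checks. Sales[UnitPrice] is a hypothetical column, and each definition would be entered separately:

```dax
-- Calculated column: TRUE for rows where the price is missing
Is Missing Price = ISBLANK ( Sales[UnitPrice] )

-- Literal checks
Is Number Check = ISNUMBER ( 42 )   -- TRUE
Is Text Check = ISTEXT ( "42" )     -- TRUE
```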
LOGICAL FUNCTIONS
Functions for returning information about values in a conditional expression
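A sketch of IF, AND and SWITCH as calculated columns on a hypothetical Sales table (the column names and thresholds are assumptions for illustration):

```dax
-- SWITCH ( TRUE (), ... ) returns the result for the first condition that is true
Order Size =
SWITCH (
    TRUE (),
    Sales[Quantity] >= 100, "Large",
    Sales[Quantity] >= 10, "Medium",
    "Small"
)

-- IF combined with AND
Bulk Discount Item =
IF ( AND ( Sales[Quantity] >= 100, Sales[UnitPrice] < 10 ), "Yes", "No" )
```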
CALCULATE
CALCULATE is one of the most powerful and versatile DAX functions. It lets you modify the existing filter context of a calculation, enabling you to perform complex calculations and aggregations.
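A minimal sketch, assuming the Sales table and the 'Date'[Year] column used earlier in this deck (the measure name is illustrative):

```dax
-- Evaluates the sum with the Year filter replaced by Year = 2005,
-- whatever year is selected in the report
Sales 2005 =
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    'Date'[Year] = 2005
)
```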
CONTEXT TRANSITION
Context Transition is the process of turning row context into filter context
By default, calculated columns understand row context but not filter context
To create filter context at the row-level, you can use CALCULATE
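The difference can be sketched with two calculated columns on a hypothetical Product table that is related to Sales (one Product row to many Sales rows):

```dax
-- No CALCULATE: the row context does not filter Sales,
-- so every product row shows the grand total
Sales All Products = SUM ( Sales[SalesAmount] )

-- With CALCULATE: the current row is turned into a filter context,
-- so each product row shows only its own sales
Product Sales = CALCULATE ( SUM ( Sales[SalesAmount] ) )
```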
EVALUATION ORDER
MODIFIERS
Modifiers are used to alter the way CALCULATE creates filter context, and are added as filter
arguments within a CALCULATE function
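Three commonly used modifiers, sketched under the assumption that a [Total Sales] measure exists and that Sales has a hypothetical ShipDate column with an inactive relationship to 'Date'[Date]:

```dax
-- REMOVEFILTERS clears any filter on the year
Sales All Years = CALCULATE ( [Total Sales], REMOVEFILTERS ( 'Date'[Year] ) )

-- KEEPFILTERS intersects the new filter with the existing one instead of replacing it
Sales 2005 Kept = CALCULATE ( [Total Sales], KEEPFILTERS ( 'Date'[Year] = 2005 ) )

-- USERELATIONSHIP activates an inactive relationship for this calculation only
Sales by Ship Date =
CALCULATE ( [Total Sales], USERELATIONSHIP ( Sales[ShipDate], 'Date'[Date] ) )
```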
TABLE FUNCTIONS
ALL FUNCTION
Returns all the rows in a table, or all the values in a column, ignoring any filters
Purpose: ignore filters
Returns: table
ALL is both a table function and a CALCULATE modifier
Removes the initial filter context
Does not accept table expressions (only physical table references)
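A typical use is a percent-of-total measure. This sketch assumes a [Total Sales] measure and a Product table, and shows ALL in both roles:

```dax
-- ALL as a CALCULATE modifier: remove every filter on Product for the denominator
% of All Products =
DIVIDE (
    [Total Sales],
    CALCULATE ( [Total Sales], ALL ( Product ) )
)

-- ALL as a table function: count every product, regardless of filters
All Product Count = COUNTROWS ( ALL ( Product ) )
```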
FILTER FUNCTION
Returns a filtered table, based on one or more filter expressions
Purpose: filter a table
Returns: table
FILTER is both a table function and an iterator
Often used to reduce the number of rows to scan
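As a small sketch, assuming a hypothetical Sales table with Quantity and SalesAmount columns, FILTER can feed a reduced table to another iterator:

```dax
-- FILTER scans Sales row by row and keeps rows with at least 10 units;
-- SUMX then adds up the amount of the remaining rows
Large Order Sales =
SUMX (
    FILTER ( Sales, Sales[Quantity] >= 10 ),
    Sales[SalesAmount]
)
```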
DISTINCT
Returns the unique values of a column, only the ones visible in the current filter context.
NumOfProducts =
COUNTROWS ( DISTINCT ( Product[ProductCode] ) )
VALUES
Returns the unique values of a column, only the ones visible in the current filter context, including the additional blank row if it is visible in the filter context.
NumOfProducts =
COUNTROWS ( VALUES ( Product[ProductCode] ) )
Use DISTINCT to create a new dimension table by extracting unique values from fields in a data table!
VALUES includes the additional blank row (when it is visible in the filter context), but DISTINCT does not
SELECTEDVALUE
SELECTEDVALUE is a convenient function that simplifies retrieving the value of a column, when
only one value is visible.
SELECTEDVALUE (
'Product Category'[Category],
"Multiple values"
)
Equivalent to:
IF (
    HASONEVALUE ( 'Product Category'[Category] ),
    VALUES ( 'Product Category'[Category] ),
    "Multiple values"
)
ALLEXCEPT
The ALLEXCEPT function in DAX is used to remove all context filters in a table except for the
filters specified in the function arguments.
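A minimal sketch, assuming a [Total Sales] measure and a Product table with a hypothetical Brand column:

```dax
-- Removes every filter on Product except the one on Brand,
-- so each product row shows the total for its brand
Brand Sales =
CALCULATE (
    [Total Sales],
    ALLEXCEPT ( Product, Product[Brand] )
)
```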
ALLSELECTED
ALLSELECTED() returns all rows in a table or values in a column, ignoring the inner filters applied by the visual itself (its rows and columns) while respecting any other existing filter context, such as slicers.
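A common use is a percentage of the visual total. This sketch assumes a [Total Sales] measure and a Product table:

```dax
-- The denominator ignores the product on the current row of the visual
-- but still respects slicer selections
% of Selected Products =
DIVIDE (
    [Total Sales],
    CALCULATE ( [Total Sales], ALLSELECTED ( Product ) )
)
```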
TABLE JOINS
RELATION FUNCTIONS
TERMINOLOGY
PHYSICAL VS VIRTUAL RELATIONSHIPS
Physical relationships are manually created, and visible in your data model
Virtual relationships are temporary, and defined using DAX expressions
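One way to define a virtual relationship is TREATAS. The sketch below assumes a hypothetical Budget table with Year and Amount columns that has no physical relationship to the 'Date' table:

```dax
-- The years visible in 'Date' are applied as a filter on Budget[Year]
-- for the duration of this calculation only
Budget for Selected Years =
CALCULATE (
    SUM ( Budget[Amount] ),
    TREATAS ( VALUES ( 'Date'[Year] ), Budget[Year] )
)
```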
RELATIONSHIP FUNCTIONS
TIME INTELLIGENCE
DATE TABLE
A date table is essential for time intelligence
If you import or create your own date table, it must meet these requirements:
Must contain all the days for all years represented in your fact tables
Must have at least one column with a Date or DateTime data type
Cannot contain duplicate dates or datetime values
If a date column includes a time component, all times must be identical (e.g. 12:00)
Should be marked as a date table (not required, but a best practice)
CALENDAR
Returns a table with one column of all dates between start and end date
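A minimal sketch of a calculated table built with CALENDAR. The table name and date range are placeholders:

```dax
-- One row per day from 1 January 2020 to 31 December 2024, in a column named [Date]
Dates = CALENDAR ( DATE ( 2020, 1, 1 ), DATE ( 2024, 12, 31 ) )
```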
CALENDARAUTO()
Returns a table with one column of dates. The range of dates is calculated automatically based on data in the model, and the optional argument sets the fiscal year end month.
CALENDARAUTO(6) means the fiscal year ends in June, so the calendar starts on 1 July (01/07)
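As a sketch, the calculated table below combines CALENDARAUTO with ADDCOLUMNS to add a few helper columns. The table and column names are illustrative:

```dax
-- Fiscal year ending in June; the generated date column is named [Date]
Fiscal Dates =
ADDCOLUMNS (
    CALENDARAUTO ( 6 ),
    "Year", YEAR ( [Date] ),
    "Month Number", MONTH ( [Date] )
)
```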
DATE FORMATTING
Use the FORMAT function to specify date/time formatting. Common examples include:
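A few format strings, sketched as calculated columns on a 'Date' table with a Date column. The column names are illustrative, and FORMAT always returns text:

```dax
Month Name = FORMAT ( 'Date'[Date], "mmmm" )   -- full month name, e.g. January
Month Short = FORMAT ( 'Date'[Date], "mmm" )   -- abbreviated month name, e.g. Jan
Day Name = FORMAT ( 'Date'[Date], "dddd" )     -- full weekday name, e.g. Monday
Year Month = FORMAT ( 'Date'[Date], "yyyy-mm" )
```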
PERFORMANCE TUNING
PERFORMANCE ANALYZER
Power BI Desktop’s Performance Analyzer can help you troubleshoot issues, measure load times for visuals and DAX queries, and optimize your code
It records user actions (like Excel’s macro recorder) and tracks the load time (in milliseconds) for each step in the process
53
Data Blogs Follow Tajamul Khan
DAX STUDIO
DAX Studio is a free tool that allows you to connect to your Power BI data model to test and
optimize your DAX queries
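In a query tool such as DAX Studio you run queries rather than define measures. A minimal sketch, assuming the Sales and 'Date' tables used earlier in this deck:

```dax
-- Returns one row per year with the total sales for that year
EVALUATE
SUMMARIZECOLUMNS (
    'Date'[Year],
    "Total Sales", SUM ( Sales[SalesAmount] )
)
ORDER BY 'Date'[Year]
```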