How to Count Unique Rows with Power Query

Bottom Line: Use Power Query to count distinct rows in a data set that contains multiple rows per transaction, invoice, etc.

Skill Level: Intermediate

Download the Excel File

You can download the example file from the video here. I've included both the original file and the file with the solution:

Data Analysis Challenge.xlsx (16.1 KB)

Data Analysis Challenge - Power Query FINAL.xlsx (30.9 KB)

Solving the Data Analysis Challenge

This post and video are the second in a series featuring solutions to a challenge I posed to readers. The inspiration for this data analysis challenge came from Rob, one of our Elevate Excel Training members.

The challenge is to create a summary report of deal count by sales stage. The issue is that the source data contains multiple rows for each deal (transaction), so we need a way to count each deal only once.

Data Analysis Challenge - Count Unique Rows in Excel for Summary Report
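
To make the problem concrete, here is a minimal Python sketch (not part of the workbook) that compares a plain row count with a count of unique deals. The Deal ID, Sales Stage, and Product column names come from the example file, but the sample values below are made up for illustration.

```python
from collections import Counter

# Hypothetical sample rows: (Deal ID, Sales Stage, Product).
# Each deal appears once per product, so deals span multiple rows.
rows = [
    (101, "Prospect", "Widget"),
    (101, "Prospect", "Gadget"),
    (102, "Proposal", "Widget"),
    (103, "Prospect", "Widget"),
    (103, "Prospect", "Gadget"),
    (103, "Prospect", "Gizmo"),
]

# Counting every row overstates the totals: deal 101 is counted twice
# and deal 103 three times.
row_counts = Counter(stage for _, stage, _ in rows)

# Counting each (Deal ID, Sales Stage) pair once gives the deal count.
unique_deals = {(deal_id, stage) for deal_id, stage, _ in rows}
deal_counts = Counter(stage for _, stage in unique_deals)

print(sorted(row_counts.items()))   # [('Proposal', 1), ('Prospect', 5)]
print(sorted(deal_counts.items()))  # [('Proposal', 1), ('Prospect', 2)]
```

The gap between 5 and 2 for the Prospect stage is exactly the over-counting the challenge asks us to avoid.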

To see the first set of solutions to the challenge, using Excel Tables and Power Pivot, watch this tutorial: 2 Ways to Calculate Distinct Count with Pivot Tables. Today, we're going to solve the same challenge using Power Query.

A quick shout-out of thanks to everyone who submitted a solution to the challenge! Excel Campus viewers are the best!

Use Power Query to Get a Distinct Count

Let me start by noting that, at the time of writing, Power Query is available only in Excel 2010 or later for Windows. If that's not you, I suggest using the pivot table solution I mentioned above.

For the Power Query solution today, the first step is to format the data as an Excel Table.

To do that, just select any cell in the data set, and click on Format as Table on the Home tab. Right-click on the table format you want and select Apply and Clear Formatting.

Format Data as an Excel Table

Hit OK when the Format as Table window appears.

Now that the data is formatted as a table, you can go to the Data tab and click on the From Table/Range button in the Get & Transform Data section.

From Table Range Power Query

Note: On Excel 2010 and 2013, Power Query is on its own tab. Please see my post on how to install Power Query if you are on one of those versions of Excel.

This opens up the Power Query Editor, loading a preview of our data. From here, we really only need to do two things:

1. Remove Duplicates

Since we want to remove the duplicates in our data set based on both the Deal ID and Sales Stage columns, we will select both columns before getting rid of the duplicates. With both columns selected, you can right-click on either of them and choose Remove Duplicates.

Right-click to remove duplicates

When multiple columns are selected, Remove Duplicates will look for duplicates by combining the values in each row from the selected columns. The result is a list of unique combinations of the Deal ID and Sales Stage columns.

Note: Technically we don't need to select two columns because the Sales Stage is always the same for each Deal ID in this example. Therefore, you can select the Deal ID only and Remove Duplicates. However, you might have a scenario where there are different stages per Deal, and in that case you would want to select both columns.
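
If it helps to see the logic outside of the Power Query Editor, here is a rough Python sketch of removing duplicates on a combination of columns. It is only an illustration of the idea, not how Power Query is implemented: the remove_duplicates helper and the sample rows are hypothetical, and keeping the first row for each combination is a choice made for this sketch.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of key_columns."""
    seen = set()
    unique_rows = []
    for row in rows:
        # Combine the selected columns into a single key for this row.
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical data in which deal 102 appears under two different stages.
rows = [
    {"Deal ID": 101, "Sales Stage": "Prospect"},
    {"Deal ID": 101, "Sales Stage": "Prospect"},
    {"Deal ID": 102, "Sales Stage": "Proposal"},
    {"Deal ID": 102, "Sales Stage": "Closed"},
]

by_both = remove_duplicates(rows, ["Deal ID", "Sales Stage"])
by_deal = remove_duplicates(rows, ["Deal ID"])

print(len(by_both))  # 3
print(len(by_deal))  # 2
```

With both columns as the key, deal 102 survives once per stage. With Deal ID alone, only its first row is kept, which is why the choice of columns matters when a deal can have more than one stage.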

2. Find the Count

From the remaining data in our data set, we want to count up each entry in the Sales Stage column. To do this, we can use the Group By feature in Power Query. With just the Sales Stage column selected, click on Group By, found on the Home tab in the Ribbon.

In the Group By window, we can use the Basic settings that are already there by default.

Group By window

That essentially leaves us with the report we are looking for because Power Query has added up the number of entries for each Sales Stage and presented them as a count in a separate column.

Power Query results of Group By feature
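
For comparison, the Group By step with a row count can be sketched in Python like this. The rows stand in for the de-duplicated data and are hypothetical, and the Count column name is just a label chosen for this sketch.

```python
from collections import Counter

# Hypothetical rows left over after removing duplicates: one row per deal.
unique_deals = [
    {"Deal ID": 101, "Sales Stage": "Prospect"},
    {"Deal ID": 102, "Sales Stage": "Proposal"},
    {"Deal ID": 103, "Sales Stage": "Prospect"},
    {"Deal ID": 104, "Sales Stage": "Closed"},
]

# Group by Sales Stage and count the rows in each group.
counts = Counter(row["Sales Stage"] for row in unique_deals)

# Shape the result like the summary report: one row per stage with its count.
report = [{"Sales Stage": stage, "Count": n} for stage, n in counts.items()]

for line in report:
    print(line)
# {'Sales Stage': 'Prospect', 'Count': 2}
# {'Sales Stage': 'Proposal', 'Count': 1}
# {'Sales Stage': 'Closed', 'Count': 1}
```

Because each deal appears only once by this point, counting rows per stage is the same as counting deals per stage.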

If you click on the top half of the Close & Load button, it will load the new report into a table on a new worksheet in the workbook.

End result table

Pros and Cons for Using Power Query

One great thing about using Power Query is that as we add or delete source data entries, we can refresh the output table (Alt+F5) to include those changes.

Another advantage of using Power Query is that your source data does not have to be an Excel worksheet. You can import data to Power Query from a database, website, .CSV file, or other source.

A disadvantage to this method is that you lose the ability to filter down further by other columns, such as Product. For that type of flexibility, the pivot table method is better.

Conclusion

In my next post, we will take a look at solving the data analysis challenge using Excel's new dynamic array formulas, which are always fun to work with. Until then, let me know if you have any questions about this solution that uses Power Query by leaving a note in the comment section below.

About Me

Hello and welcome! My name is Jon Acampora and I'm here to help you learn Excel.
