Implementing Paging for SharePoint in Power Automate

April 26, 2023
7 min read

One of the many benefits of using Microsoft Power Automate is the ability to process large amounts of data in batches. By automating such tasks, you reduce the risk and effort associated with performing them manually and can focus on more impactful tasks. Many connectors in Power Automate are set up to process large amounts, but some are limited by the APIs they use. In this article, I’ll use an example to show how can you overcome this limitation when you work with large Microsoft SharePoint datasets.

SharePoint Connector

One of the first connectors that was released with Power Automate was for SharePoint. It is no surprise because SharePoint is one of the most popular and heavily used services within the Microsoft 365 ecosystem. The SharePoint connector has paging enabled for many of its actions via the Settings menu, as shown in Figure 1.

Image of Settings screen for Get files. To allow retrieval of large data sets, pagination is enabled, and the threshold is set to 100,000.
Figure 1: Enabling pagination in SharePoint enables you to retrieve large amounts of data. | Image: Haniel Croitoru/Microsoft.

However, in some cases, you may need to leverage the Send an HTTP Request action from the SharePoint connector. This action enables you to retrieve more information than the standard actions do, but lacks pagination functionality.

Case Scenario – Auditing OneDrive for Business

To better understand this scenario, see the Auditing Your OneDrive for Business Sharing Using Power Platform (tekkigurus.com) article that I recently published. In that scenario, I needed to scan entire OneDrive for Business sites, which can have tens to hundreds of thousands of documents in them. On a sample OneDrive that contains around 36,000 files, the time to scan all the files and find users with whom files are shared was around 18 hours. Of those files, only 185 had custom permissions. Table 1 shows the overall process:

Action                  Description                                        Times Run                            Duration
Get Files Iterate through all the files in OneDrive for Business using the Get Files action.
Use the built-in pagination shown in Figure 1.
 Once explicitly, but it was implicitly called several times to get batches of 5,000 files ~5 min
       
Apply to each Check each file to see if it has custom permissions. ~36,000 ~18 hours
       
Collect information If file has custom permissions, collect information about it. Only files that were shared ~18 hours

Table 1: Tasks and durations for original implementation of OneDrive for Business scanning for custom permissions.

This process worked and falls within the 30-day Power Automate flow execution limit. However, it could only be optimized if the files with custom permissions could be retrieved.

SharePoint REST Call and Pagination

As mentioned earlier, Send an HTTP Request offers a lot of flexibility. You can pick specific properties to retrieve, to reduce the amount of information that needs to be retrieved. You can even collect information that is not available by using the built-in actions. The following REST call retrieves OneDrive for Business shared files only.

 

https://<tenant>-my.sharepoint.com/personal/<user email>/_api/web/lists/getbytitle ('Documents')/ items?
$select=FileLeafRef,HasUniqueRoleAssignments,ID,SharedWith,Title,SharedWithDetails,FileRef,FileDirRef,ServerUrl,DisplayName &
$top=1000&$skiptoken=Paged=TRUE%26p_ID=0

 

The first line indicates that files are to be retrieved from a person’s OneDrive for Business site from the Documents library. 

The second line indicates the specific properties to collect. In our case, HasUniqueRoleAssignments is the key property. It is set to true if a file has custom permissions and false otherwise. If only these two rows are used, this REST API call will return the first 100 files. 

By adding on the third line, you specify that you want to get 1000 items from this call. The p_ID specifies the ID of the first items. By increasing the p_ID parameter, you can get batches of different files. So, in the first call, you would use p_ID=0. The next time, you would start with p_ID=1000 (assuming the top count is set to 1000), etc.

Putting It All Together

Now that you know what it takes to use the REST API call, you can build a flow to collect the information. The flow uses four variables and a loop. Note that in my flow, I renamed the Send an HTTP Request action to Get Files for better readability of the flow.

Image of Power Automate Flow showing the actions needed to create the pagination logic for Send an HTTP Request action.
Figure 2: Overview of flow to paginate through a set of files using the Send an HTTP Request action. | Image: Haniel Croitoru/Microsoft.

The variables are as follows:

Variable                    Purpose                  Type                  Initial Value
TopCount Number of files to return in each Send an HTTP Request call Integer 1000
       
p_ID ID of the first file in a set of files being retrieved Integer 0
       
Results Array of all the file properties I’m retrieving Array []
       
ResultsCount Count of number of items retrieved in single loop execution; initially set to 1 so the loop will run at least once Integer 1

Table 2: Variables for building the flow.

Inside the loop, the key operations are to retrieve file properties, append the retrieved file properties to the existing Results array, and then increment the various counters, as shown in Figure 3.

Image of Power Automate Flow showing the actions inside the loop for the pagination logic.
Figure 3: Overview of loop inside the flow. | Image: Haniel Croitoru/Microsoft. View Full Size

Do Until

The loop has been set up to run until the ResultsCount is 0. This means that once Get Files (in the next section) stops retrieving values, the execution ends. As described earlier, ResultsCount is initially set to 1 so that the loop will run at least once.

Get Files

This action, as described earlier, does all the heavy lifting, whereby it retrieves the file properties. The TopCount and p_ID values have been replaced by the variables defined. The TopCount is a constant that I set to 1000. The p_ID is initially set to 0.

Unify the Results with the Previous Collected Data

This is a Compose action, which again I renamed for better readability. The action uses the union() PowerFX expression to take the existing Results array values and combine them with the results from the Get Files action. The expression is as follows:

union(body('Get_Files')?['d']?['results'],variables('Results'))

The reason for using a Compose action and not assigning the outcome of this expression directly to the Results variable is that in Power Automate flows it is not possible to reference a variable in an expression that modifies it. In other words, Results cannot be referenced in the action that modifies Results. The final three actions described below can be running in any order because there is no dependency between them.

Increment p_ID by TopCount

As the name suggests, this integer is incremented by 1000 in each iteration of the loop. In the last iteration of the loop, the value of p_ID will be the nearest 1000 below the actual number of files. If there are 3999 files, then the last p_ID will be 3000.

Set ResultsCount

The variable gets set to the number of items that the Get Files action retrieves within an execution of a loop. The number is calculated using the PowerFX expression.

length(body('Get_Files')?['d']?['results'])

Set the Unified Results

The results of the concatenated arrays (from the Compose action) are assigned to the Results and ultimately returned for processing.

Revisiting the Case Scenario – Auditing OneDrive for Business

With the new workflow, I saw some tremendous speed improvements. The overall execution went from around 18 hours down to just over 17 minutes. This is largely because I let the internals of the SharePoint API do all the heavy lifting by filtering out all the files that didn’t have custom permissions and returning only the 185 files needed. This is an example of how an understanding of the APIs that are wrapped in the connectors can help optimize performance.

Haniel Croitoru

Haniel Croitoru

Haniel Croitoru is an enterprise architect and Microsoft MVP with over 20 years of experience in Microsoft 365. Since 2003, Haniel has been focusing on delivering solutions to meet short-term and long-term business goals using Microsoft 365 workloads such as SharePoint, Teams, Power Platform and more. 

In addition to Haniel’s professional tenure, he has always been a big proponent of sharing knowledge and giving back to the community through presenting at numerous conferences and networking meetings on the topics of utilizing Microsoft 365 for business process optimization, effective collaboration and communication, and managing projects using the platform. 

Prior to entering the Microsoft 365 arena, Haniel spent several years in the medical imaging industry where he helped launch an orthopedic software division and published four patents and numerous articles. 

Haniel holds a Master of Science in Computer Science with a specialty in Computer-Assisted Orthopedic Surgery from Queen’s University and a Master’s Certificate in Project Management from the York Schulich School of Business. He is a PMI-certified Project Management Professional (PMP) and Agile Certified Practitioner (PMI-ACP).