Implementing Pagination with MuleSoft
Sometimes an API we consume returns its result across multiple pages. If the business case requires the Mule application to process the results from every page and return a response that meets the specification, we face a concrete problem: how to process n pages and return the combined response to the calling client.
By design, the backend API returns only x records per page, so depending on the total number of records there could be n pages. As the consumer of that backend system and the gateway to the client, you are responsible for handling pagination.
Let’s look at how to solve this problem effectively using MuleSoft. One simple solution is a recursive flow that calls the backend API repeatedly until all records are fetched and no more pages exist.
Pros: The main advantage of this approach is its simplicity; it is easy to implement.
Cons: The approach is synchronous, so the next page cannot be fetched until processing of the prior pages is complete.
The response is therefore slow.
The main disadvantage is that, with many records and therefore many iterations, the Mule runtime might crash with a “Too many child contexts” error.
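The recursive approach can be sketched outside Mule to make its shape visible. The following is a minimal Python illustration, not the Mule flow itself: fetch_page, PAGE_SIZE, and the in-memory RECORDS list are hypothetical stand-ins for the backend API.

```python
# Stand-in for the backend API: 7 records served 3 per page.
# RECORDS, PAGE_SIZE and fetch_page are hypothetical names for illustration.
RECORDS = [f"record-{i}" for i in range(1, 8)]
PAGE_SIZE = 3


def fetch_page(page_number):
    """Return the records on the given 1-based page (empty list past the end)."""
    start = (page_number - 1) * PAGE_SIZE
    return RECORDS[start:start + PAGE_SIZE]


def fetch_all_recursive(page_number=1):
    """Fetch this page, then recurse into the next one until a page is empty."""
    page = fetch_page(page_number)
    if not page:
        return []  # no more pages: stop recursing
    # Each call waits for the next page before returning, so the work is
    # strictly sequential and the call depth grows with the page count.
    return page + fetch_all_recursive(page_number + 1)


all_records = fetch_all_recursive()
```

Every page adds one more level of nesting, which is why this style is simple for a handful of pages but becomes fragile as the page count grows.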
An alternative is a more complex implementation, in which asynchronous processing comes to the rescue.
To demonstrate this approach, we will call the GitHub API to return all the repositories under a specific user or organization. A few authentication steps are needed before we can call the GitHub API and get the results.
Authentication for GitHub API
There are two ways to authenticate through the GitHub REST API.
1) Basic authentication
$ curl -u "username" https://api.github.com
2) OAuth2 token (sent in a header)
$ curl -H "Authorization: token OAUTH-TOKEN" https://api.github.com
The recommended way to authenticate is with an OAuth token sent in the Authorization header.
In this implementation, we will use token-based authorization with a Personal Access Token. A Personal Access Token can be generated for each user and sent in the Authorization header.
If you want to know more details about how to create a Personal Access Token for your GitHub account, you can follow this link.
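As a rough illustration of the header-based option, the sketch below builds a request that carries the token in the Authorization header, mirroring the second curl command. It uses Python's standard urllib and only constructs the request without sending it. The function name build_github_request is hypothetical, and "OAUTH-TOKEN" is the same placeholder used in the curl example.

```python
import urllib.request

GITHUB_API_URL = "https://api.github.com"


def build_github_request(token, url=GITHUB_API_URL):
    """Build (but do not send) a request with the token in the Authorization header."""
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


# "OAUTH-TOKEN" is a placeholder; a real token should come from secure configuration.
request = build_github_request("OAUTH-TOKEN")
```

Passing the request to urllib.request.urlopen would send it; in the Mule application the equivalent is setting the same header on the HTTP request.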
The main implementation involves two steps:
Step 1: Get the results for the first page. This is handled by the GetFirstPageResponse sub-flow, which calls the GitHub API with the page number set to 1.
Step 2: Once we have the page 1 result, call the ProcessPageResults sub-flow to fetch the subsequent pages and append the results of all pages.
Step 1: Getting First Page from GitHub API
Getting the first page is straightforward. We set up essential variables, such as pageNumber, that are needed to call the GitHub API. Then we invoke the GitHub API to get the results of the first page. For more details about how to call the GitHub API, see the section “Invoking GitHub API.” The next step is to store this result in an Object Store. Storing the first page there makes it easy to append the results of all pages at the end of the process, as we will see in a moment.
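The logic of this step can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the Mule sub-flow: a plain dict plays the role of the Object Store, call_github_api fakes the HTTP call with canned data, and the "results" key is a hypothetical name.

```python
# Hypothetical stand-ins: a dict plays the role of the Object Store, and
# call_github_api fakes the HTTP call with two pages of repository names.
FAKE_PAGES = {1: ["repo-a", "repo-b"], 2: ["repo-c"]}
object_store = {}


def call_github_api(page_number):
    """Return the repositories on the given page (empty list if none)."""
    return FAKE_PAGES.get(page_number, [])


def get_first_page_response(store):
    """Set pageNumber to 1, fetch that page, and keep the result in the store."""
    page_number = 1
    first_page = call_github_api(page_number)
    # Store a copy so later appends to the store do not mutate the page itself.
    store["results"] = list(first_page)
    return first_page


first_page = get_first_page_response(object_store)
```

Keeping the first page under a known key gives the later step a single place to append to as further pages arrive.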
Step 2: Processing Subsequent Pages
In the previous section, we fetched the first page from the GitHub API. Now it is time to process the results from the other pages. In this section, we will discuss processing the subsequent pages asynchronously.
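To give a feel for the idea before going into the Mule details, here is a minimal Python sketch of fetching the remaining pages concurrently and appending them to the first page. It rests on several assumptions: the total number of pages is taken as already known, a thread pool stands in for Mule's asynchronous processing, and call_github_api and FAKE_PAGES are hypothetical stand-ins for the real API call.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canned data standing in for the GitHub API responses.
FAKE_PAGES = {1: ["repo-a", "repo-b"], 2: ["repo-c", "repo-d"], 3: ["repo-e"]}
TOTAL_PAGES = 3  # assumption: the page count is known before this step runs


def call_github_api(page_number):
    """Return the repositories on the given page (empty list if none)."""
    return FAKE_PAGES.get(page_number, [])


def process_page_results(first_page, total_pages):
    """Fetch pages 2..total_pages concurrently and append them after page 1."""
    combined = list(first_page)
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() runs the calls concurrently but yields results in page order.
        for page in pool.map(call_github_api, range(2, total_pages + 1)):
            combined.extend(page)
    return combined


all_repos = process_page_results(call_github_api(1), TOTAL_PAGES)
```

Unlike the recursive version, no page has to wait for the previous one to finish, and the nesting depth stays constant regardless of how many pages there are.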