Pagination is a technique used in REST APIs to manage and deliver large datasets efficiently. Rather than returning the entire dataset in one response, the API divides the data into smaller chunks, or pages. Each API call retrieves one page of data, along with metadata that tells the client how to fetch subsequent pages. This approach improves the performance and responsiveness of applications by reducing the load on the server and the client. Pagination is commonly used in web applications, especially when displaying lists of data such as search results, user records, logs, or financial data.
Paginated responses typically include not only the requested data for a specific page but also metadata about the dataset’s overall structure. This metadata often consists of the current page number, total number of available pages, number of entries per page, total count of entries, and URLs for the next or previous pages. This metadata helps developers and users to navigate through the dataset efficiently.
Why Pagination is Necessary in APIs
The need for pagination arises from the limitations in bandwidth, memory, and processing power that both client and server systems face. If a REST API were to return an entire dataset containing tens or hundreds of thousands of records in a single response, several issues would arise. The server would have to allocate considerable resources to gather and transmit the data. At the same time, the client would have to allocate enough memory to receive, parse, and render this data. This exchange could significantly slow down both systems and reduce the overall performance of the application.
By introducing pagination, developers can ensure that the client application fetches only the data it needs at any given moment. Pagination allows users to request data incrementally, often triggered by scrolling, clicking a ‘next’ button, or specifying a page number. This leads to a more manageable, scalable, and responsive system. In systems where new data is added frequently, pagination also allows the application to stay updated more efficiently, as only a portion of the dataset needs to be refreshed or retrieved again.
Types of Pagination
There are different approaches to implementing pagination in REST APIs, depending on the structure of the data and the system’s performance goals. The most common types include offset-based pagination, cursor-based pagination, and keyset pagination.
Offset-based pagination is the most widely used method. The client specifies either an offset and a limit, or a page number and a limit that the server converts to an offset. For example, a request might look like GET /items?page=3&limit=20, which instructs the server to return the third page of results with 20 items per page, that is, to skip the first 40 items.
Cursor-based pagination, often used in systems with rapidly changing data, has the server return a token (the cursor) that marks a position in the dataset, usually derived from a timestamp or unique ID. The client sends the cursor back to request the next page. This method scales better for large datasets because the database can seek directly to the cursor position instead of reading and discarding every row before an offset.
Keyset pagination is the technique that usually sits behind a cursor: the query filters on an ordered column such as an ID or date, for example by asking for rows with an ID greater than the last one seen. Because the position is tied to a value rather than a row count, records are not skipped or repeated when rows are inserted or deleted between paginated requests.
Each of these methods comes with trade-offs in terms of complexity, accuracy, and performance. The choice of pagination method depends on the specific use case, data structure, and system requirements.
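The difference between counting rows and seeking by value can be sketched in Power Query M, the language used later in this article. This is a minimal illustration on an in-memory list; Items, OffsetPage, and KeysetPage are hypothetical names, and a real API would apply the same logic on the server side.

```m
let
    // Hypothetical dataset of 95 records standing in for a server-side table
    Items = List.Transform({1 .. 95}, each [id = _, name = "item " & Text.From(_)]),

    // Offset-based: page and limit translate to "skip (page - 1) * limit rows"
    OffsetPage = (page as number, limit as number) as list =>
        List.FirstN(List.Skip(Items, (page - 1) * limit), limit),

    // Keyset-based: "rows whose id is greater than the last id seen"
    KeysetPage = (afterId as number, limit as number) as list =>
        List.FirstN(List.Select(Items, each [id] > afterId), limit),

    OffsetIds = List.Transform(OffsetPage(3, 20), each [id]),   // ids 41 to 60
    KeysetIds = List.Transform(KeysetPage(60, 20), each [id])   // ids 61 to 80
in
    [OffsetIds = OffsetIds, KeysetIds = KeysetIds]
```

If a record near the start of the list were deleted between two requests, the offset version would shift every later page by one row, while the keyset version would still resume right after id 60.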
Pagination Metadata in REST APIs
Pagination metadata is essential for navigating a paginated response. When a client sends a request for a specific page of data, the response often includes additional information that guides the client on how to fetch other pages. This metadata varies based on the API design but generally includes several common elements.
The current page number indicates the specific page that the server is returning in the response. This helps the client keep track of the position in the dataset.
The total number of pages is another important piece of metadata that informs the client about how many more pages of data are available. This information is useful for rendering navigation controls such as page numbers or next/previous buttons.
The next page URL is a key part of the metadata that enables the client to fetch the subsequent page without recalculating the page number or constructing a new URL manually. Similarly, APIs might also provide previous page URLs or links to the first and last pages.
In some APIs, the metadata may also include the total count of records in the dataset, the number of items per page, or the index of the first and last item on the current page. All these pieces of information contribute to an improved and efficient client-side implementation.
APIs typically include this metadata in a separate object or wrapper around the data itself. This separation ensures that the client can handle both the data and the navigation logic without confusion. In JSON responses, for example, a common structure might include data, meta, and links objects. The data array contains the actual records, while the meta object holds summary information, and the links object provides URLs for navigating pages.
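As a rough illustration of that layout, the sketch below parses a small made-up response body and reads one value from each part. The field names inside meta and links are assumptions chosen for the example; real APIs name them differently.

```m
let
    // Made-up response body using the data / meta / links layout
    Body =
        "{ ""data"": [ {""id"": 1}, {""id"": 2} ],"
        & " ""meta"": { ""count"": 2, ""total-count"": 5, ""total-pages"": 3 },"
        & " ""links"": { ""next"": ""/items?page=2"", ""prev"": null } }",
    Parsed = Json.Document(Body),

    Records = Parsed[data],                      // list of two records
    TotalPages = Parsed[meta][#"total-pages"],   // 3
    HasNext = Parsed[links][next] <> null        // true
in
    [Records = Records, TotalPages = TotalPages, HasNext = HasNext]
```

Field names containing a hyphen, such as total-pages, need the quoted identifier form #"total-pages" in M.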
Advantages of Pagination
Pagination provides several practical benefits for both API providers and consumers. These advantages range from performance improvements to better user experiences.
Performance optimization is one of the key benefits of pagination. By returning only a portion of the data at a time, the server reduces the memory and CPU resources required to process and respond to requests. It also speeds up the delivery of data, especially over slower networks, since smaller payloads are transmitted.
Managing server load is another advantage. Servers that handle thousands of users or queries simultaneously can become overwhelmed if forced to return large datasets in a single response. Pagination helps balance the load by spreading data retrieval across multiple requests.
Improved user experience is also a major benefit. When users interact with applications that load data gradually—such as through infinite scrolling or page navigation—they perceive the application as faster and more responsive. Users are not forced to wait for all data to load before interacting with the interface, leading to a smoother experience.
Memory efficiency on the client side is improved as well. Devices with limited memory, such as smartphones or tablets, benefit significantly from paginated data. Instead of trying to store and render an entire dataset, the application only processes the data currently being displayed.
Reduced network traffic is another benefit. In distributed or mobile environments, minimizing the volume of data transferred helps conserve bandwidth and lowers data usage. This is particularly important in areas with limited internet access or costly data plans.
Together, these advantages make pagination a critical feature for scalable, user-friendly, and performant API designs. Well-implemented pagination can significantly enhance both backend performance and frontend usability.
Common Pagination Practices and Standards
While pagination logic may vary across systems, some practices have become common standards in API development. These include using clear query parameters, returning consistent metadata, and supporting multiple pagination strategies.
Clear query parameters such as page, limit, offset, or cursor help users and developers understand how to retrieve paginated data. These parameters should be documented clearly and behave consistently across endpoints.
Returning consistent metadata and navigation links in responses makes it easier for developers to automate pagination. For example, including next, prev, first, and last URLs in the response helps tools like Power BI or custom scripts handle pagination without requiring manual construction of URLs.
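A minimal sketch of how a client can follow next links without building URLs itself is shown below. FakeApi and Fetch are stand-ins invented for this example so that it runs without network access; a real query would call Web.Contents inside Fetch.

```m
let
    // Made-up responses keyed by URL, standing in for real HTTP calls
    FakeApi = [
        #"/items?page=1" = [data = {"a", "b"}, links = [next = "/items?page=2"]],
        #"/items?page=2" = [data = {"c", "d"}, links = [next = "/items?page=3"]],
        #"/items?page=3" = [data = {"e"}, links = [next = null]]
    ],
    Fetch = (url as text) as record => Record.Field(FakeApi, url),

    // Keep fetching while the previous response supplied a next link
    Responses = List.Generate(
        () => Fetch("/items?page=1"),
        each _ <> null,
        each if [links][next] <> null then Fetch([links][next]) else null,
        each [data]
    ),
    AllItems = List.Combine(Responses)   // {"a", "b", "c", "d", "e"}
in
    AllItems
```

The client never computes a page number here; it only repeats whatever URL the server handed back, which is why consistent links make automation simpler.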
Supporting both offset-based and cursor-based pagination can also be useful, especially for APIs expected to handle both small and large datasets. Offset-based pagination works well for stable datasets, while cursor-based pagination handles high-churn or real-time data more effectively.
In some cases, APIs convey pagination information in HTTP response headers, such as a Link header, instead of embedding it in the response body. While this method is less common, it can simplify the structure of the JSON response. However, it requires clients to read and parse headers carefully, which may add complexity.
Designing pagination carefully with flexibility and scalability in mind ensures that the API remains useful and performant over time. Developers should also consider how changes in data might affect pagination, such as deleted records, inserts, or updates that may shift data positions between requests.
Querying Paginated REST APIs with Power Query
Introduction to Power Query
Power Query is a data transformation and connection tool that is widely used within data analytics platforms such as Power BI and Excel. It allows users to extract, transform, and load (ETL) data from a wide range of sources. One of its key features is the ability to work with APIs, particularly RESTful APIs that provide structured data over HTTP. With Power Query, users can call an API, transform the JSON response, clean the data, and load it into visualizations or reports.
When dealing with APIs that return paginated data, Power Query proves highly effective. Instead of fetching all data manually or stitching together multiple calls by hand, users can build reusable queries and functions that automate the paging process. This automation is critical when the API provides thousands or millions of records that cannot be retrieved in a single call due to performance or design limitations.
Power Query and REST API Pagination
REST APIs often return paginated results to prevent overwhelming the client or server with large data responses. These APIs typically support query parameters such as page, limit, offset, or cursor. When using Power Query to access such APIs, the process involves creating a query that dynamically modifies these parameters and aggregates the resulting pages into one unified table.
Power Query works with the M language, a functional language that has no loop statements. Iteration is instead expressed with list functions such as List.Transform and List.Generate, or with recursion, and functions can be defined and passed around as values. These features allow developers to define a paging function that makes repeated requests to the API, each time with a different page number, and combines all the returned data. Depending on the API's metadata, the function can iterate until all pages are fetched.
One important aspect of using Power Query with paginated APIs is understanding the API structure. Before building a paging function, it is necessary to inspect the JSON response to determine where the data is located, how pages are defined, and whether any metadata such as total page count or next page links are included. This inspection guides the development of the paging function and helps ensure that the query retrieves all required data without errors.
Creating a Custom Paging Function
To demonstrate how Power Query handles pagination, consider building a function that retrieves exchange rate data from a publicly available API. The API returns paginated JSON data, including metadata such as total pages, record count, and links. Using this structure, we can create a custom Power Query function that takes a page number as input and returns the data from that specific page.
Begin by launching Power BI or Excel and opening the Power Query Editor. From there, you can create a blank query and use the Advanced Editor to write the custom function. This function is designed to dynamically generate a URL that includes the page number and then extract the relevant data from the response.
In this example, a function named GetPage is defined. This function accepts a page number as a text value, constructs the full URL using that page number, and fetches the data from the API. The returned JSON data is parsed, and only the necessary fields are extracted and converted into a table.
This approach allows Power Query to repeatedly call the API with different page numbers, creating a loop that aggregates data from multiple pages. By transforming and cleaning the data within the function, the resulting tables are immediately ready for analysis or loading into the data model.
The following function provides the foundation for fetching each page:
```m
// GetPage: fetch one page of exchange rate data and return it as a table
(page as text) =>
let
    // Request the page; if the call or the JSON parsing fails, fall back to a record of nulls
    Source = try
        Json.Document(
            Web.Contents(
                "https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?page[number]="
                    & page
                    & "&fields=country_currency_desc,exchange_rate,record_date"
            )
        )
    otherwise
        [data = null, meta = null, links = null],
    #"Converted to Table" = Table.FromRecords({Source}),
    // Keep only the data column
    #"Removed Columns" = Table.RemoveColumns(#"Converted to Table", {"meta", "links"}),
    // One row per record in the data list
    #"Expanded data List" = Table.ExpandListColumn(#"Removed Columns", "data"),
    // One column per requested field
    #"Expanded data Records" = Table.ExpandRecordColumn(
        #"Expanded data List",
        "data",
        {"country_currency_desc", "exchange_rate", "record_date"},
        {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}
    )
in
    #"Expanded data Records"
```
This function performs several tasks. First, it attempts to call the API and parse the JSON response. If the request fails, the function gracefully handles the error by returning empty or null values. Then it converts the JSON structure into a tabular format. Non-essential elements such as metadata and links are removed. The nested data list is expanded to reveal each individual record, and those records are further expanded into fields for easier analysis.
Once this function is created and saved under the name GetPage, it can be reused in the main query to retrieve all the pages by looping through the total page count or until no more data is returned.
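One possible variant, sketched here rather than taken from the original function, passes the parameters through the Query option of Web.Contents instead of concatenating them into the URL. Keeping the first argument a fixed string may help with scheduled refresh in the Power BI service, which can reject data sources whose URL is assembled dynamically. The name GetPageWithQuery is hypothetical, and the returned columns carry the plain field names rather than the data. prefix used by GetPage.

```m
// GetPageWithQuery (hypothetical name): same request as GetPage, built with the Query option
(page as text) as table =>
let
    Response = Json.Document(
        Web.Contents(
            "https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange",
            [
                // Web.Contents encodes these values into the query string
                Query = [
                    #"page[number]" = page,
                    fields = "country_currency_desc,exchange_rate,record_date"
                ]
            ]
        )
    ),
    // The data field is a list of records, which converts directly to a table
    Result = Table.FromRecords(Response[data])
in
    Result
```

This version has no try ... otherwise fallback, so a failed request surfaces as an error instead of a row of nulls; which behavior is preferable depends on how the calling query detects the last page.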
Exploring Metadata for Pagination
The API used in the example includes metadata that helps control pagination. By inspecting the initial API response, you can see fields such as total-count, count, total-pages, and navigational links. These fields provide valuable information that can be used to build a dynamic list of pages to retrieve.
For instance, if the metadata includes a total-pages field, it becomes straightforward to create a list from 1 to the total number of pages and apply the GetPage function to each entry in that list. This process uses Power Query’s ability to transform lists into tables and invoke functions over each row. The result is a unified table containing data from all pages.
In other cases, the metadata might include only the total count of records and the number of records per page. From this, the total number of pages can be calculated by dividing the total count by the page size and rounding up. In scenarios where neither page count nor entry count is available, more advanced logic such as conditional looping with List.Generate is required. This method involves calling the function repeatedly until an empty or null result is returned, indicating that no more data is available.
Understanding and leveraging metadata is essential for efficient paging. It ensures that the query retrieves the correct amount of data and avoids unnecessary or failed requests. Proper handling of pagination metadata also contributes to cleaner and more maintainable code.
Benefits of Using Power Query for Paginated APIs
Using Power Query to work with paginated REST APIs offers a number of benefits. First, it allows for reusable and parameterized queries that simplify data loading processes. By defining a single paging function, data from any page can be retrieved without rewriting code or modifying URLs manually.
Second, Power Query’s visual interface helps users understand and track the steps involved in data transformation. Each step of the paging and extraction process is visible in the editor and can be reviewed or adjusted at any time. This transparency is helpful for both new users and experienced analysts.
Third, Power Query supports automatic data refresh and scheduling when used with platforms like Power BI. This means that once a paging query is set up, it can retrieve updated data from the API at scheduled intervals without manual intervention.
Finally, Power Query’s flexibility in handling JSON structures, applying transformations, and combining multiple sources makes it ideal for working with modern REST APIs. It simplifies otherwise complex processes and integrates seamlessly with the broader data analysis workflow.
Methods to Retrieve Pages from Paginated APIs Using Power Query
Handling Pagination When Total Page Count Is Known
When working with a REST API that provides the total number of pages in its metadata, Power Query can be used to generate a list of page numbers and fetch data from each one sequentially. This is the simplest and most direct pagination method because all required information is explicitly provided by the API.
To implement this method, the process begins by inspecting the first response from the API. This response includes a meta section containing a field such as total-pages. With that number, you can build a list of integers from 1 to total-pages. Then, using Power Query, you invoke the custom function GetPage (explained in the previous part) on each page number in the list.
In Power Query, this can be done by writing a query that performs the following steps:
- Fetch the first page of the API response to get metadata
- Read the total number of pages
- Generate a list from 1 to the total number of pages
- Transform this list into a table
- Call the GetPage function for each page number
- Expand the results into a single table
This method is reliable and efficient, especially for APIs that maintain consistent pagination metadata. It avoids over-fetching or making unnecessary requests and ensures all available records are collected.
Here is the Power Query M code that demonstrates this approach:
```m
let
    // First request: used only to read the pagination metadata
    Source = try
        Json.Document(
            Web.Contents(
                "https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?fields=country_currency_desc,exchange_rate,record_date"
            )
        )
    otherwise
        [data = null, meta = null, links = null],
    // Page numbers 1 to total-pages, or a single null if the metadata is missing
    PageList =
        if Source = null or Source[meta] = null then
            {null}
        else
            {1 .. Source[meta][#"total-pages"]},
    PageTable = Table.FromList(PageList, Splitter.SplitByNothing()),
    Renamed = Table.RenameColumns(PageTable, {{"Column1", "PAGES"}}),
    // GetPage expects the page number as text
    ChangedType = Table.TransformColumnTypes(Renamed, {{"PAGES", type text}}),
    // Fetch each page into a nested table column
    InvokedFunction = Table.AddColumn(
        ChangedType,
        "GetPage",
        each if [PAGES] <> null then GetPage([PAGES]) else null
    ),
    // Flatten the nested tables and give the columns friendly names
    Expanded = Table.ExpandTableColumn(
        InvokedFunction,
        "GetPage",
        {"data.country_currency_desc", "data.exchange_rate", "data.record_date"},
        {"Country", "ExchangeRate", "RecordDate"}
    )
in
    Expanded
```
This code builds a reliable foundation for retrieving all pages when the page count is known. The result is a complete dataset, accurately combined from all individual pages.
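The same combination step can also be written without an intermediate table of page numbers, by mapping the function over a list and appending the results. The sketch below uses a stand-in called FakeGetPage and a fixed TotalPages so that it runs offline; both are assumptions for illustration, and in a real query GetPage and the total-pages metadata value would take their place.

```m
let
    // Stand-in for GetPage: returns a one-row table per page
    FakeGetPage = (page as text) as table =>
        #table({"page", "value"}, {{page, "row from page " & page}}),

    TotalPages = 3,   // would come from Source[meta][#"total-pages"]

    // One table per page, then a single append
    Pages = List.Transform({1 .. TotalPages}, each FakeGetPage(Number.ToText(_))),
    Combined = Table.Combine(Pages)   // three rows, one from each page
in
    Combined
```

Table.Combine appends the tables by column name, so this form needs no expand step as long as every page returns the same columns.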
Handling Pagination When Entry Count Is Known but Page Count Is Unknown
Some APIs may not directly provide the number of pages but do include the total number of records (total-count) and the number of records per page (count). In such cases, the total number of pages can be calculated by dividing the total entries by the number of entries per page and rounding up, so that a partially filled last page is still counted. This calculated value can then be used to loop through all page numbers.
This method still requires an initial API call to read the meta section and retrieve both total-count and count. After the total number of pages is derived, the process of generating a page list and calling GetPage remains the same as before.
Here is the Power Query M code used for this case:
```m
let
    // First request: used only to read the pagination metadata
    Source = try
        Json.Document(
            Web.Contents(
                "https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?fields=country_currency_desc,exchange_rate,record_date"
            )
        )
    otherwise
        [data = null, meta = null, links = null],
    // Page count = total records / records per page, rounded up
    PageList =
        if Source = null or Source[meta] = null then
            {null}
        else
            {1 .. Number.RoundUp(Source[meta][#"total-count"] / Source[meta][#"count"])},
    PageTable = Table.FromList(PageList, Splitter.SplitByNothing()),
    Renamed = Table.RenameColumns(PageTable, {{"Column1", "PAGES"}}),
    // GetPage expects the page number as text
    ChangedType = Table.TransformColumnTypes(Renamed, {{"PAGES", type text}}),
    // Fetch each page into a nested table column
    InvokedFunction = Table.AddColumn(
        ChangedType,
        "GetPage",
        each if [PAGES] <> null then GetPage([PAGES]) else null
    ),
    // Flatten the nested tables and give the columns friendly names
    Expanded = Table.ExpandTableColumn(
        InvokedFunction,
        "GetPage",
        {"data.country_currency_desc", "data.exchange_rate", "data.record_date"},
        {"Country", "ExchangeRate", "RecordDate"}
    )
in
    Expanded
```
This method is especially useful when APIs are designed to return the total number of records but omit the number of pages. Calculating the number of pages ensures that the loop terminates at the right moment, avoiding extra or missing data.
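The arithmetic is easy to check in isolation. The figures below are invented for the example: 1,234 records at 100 per page.

```m
let
    TotalCount = 1234,   // made-up total-count value
    PageSize = 100,      // made-up count (records per page)

    // 1234 / 100 = 12.34, which rounds up to 13 pages
    TotalPages = Number.RoundUp(TotalCount / PageSize),
    PageList = {1 .. TotalPages},

    // The last page holds whatever remains after 12 full pages: 34 rows
    LastPageRows = TotalCount - (TotalPages - 1) * PageSize
in
    [TotalPages = TotalPages, LastPage = List.Last(PageList), LastPageRows = LastPageRows]
```

Rounding down or to the nearest integer would give 12 pages here and silently drop the final 34 records, which is why the query uses Number.RoundUp.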
Handling Pagination When Both Entry Count and Page Count Are Unknown
The most complex scenario occurs when neither the number of entries nor the total number of pages is provided by the API. In such cases, Power Query must determine on its own when to stop requesting additional pages. This requires dynamically checking the response of each page and continuing to the next only if valid data is returned.
To achieve this, Power Query uses List.Generate, a function that creates a list by repeating an operation until a condition is met. In this context, List.Generate will:
- Start with the first page
- Call the GetPage function
- Check whether the result is empty
- If it is not empty, increment the page number and repeat
- Stop when the result is empty, indicating there are no more pages
This approach ensures that only valid pages are retrieved, and no assumptions are made about page count or entry count. It is ideal for APIs that do not include pagination metadata and rely on implicit structures.
Here is an implementation using Power Query M:
```m
let
    Source = List.Generate(
        // Initial state: page 1 and its results
        () => [PAGE = 1, RESULTS = GetPage("1")],
        // Continue while the page has at least one row without nulls
        each Table.RowCount(
            Table.SelectRows(_[RESULTS], each not List.Contains(Record.ToList(_), null))
        ) > 0,
        // Next state: advance the page number and fetch that page
        each [PAGE = _[PAGE] + 1, RESULTS = GetPage(Number.ToText([PAGE] + 1))],
        // Keep both fields of each state
        each _[[PAGE], [RESULTS]]
    ),
    PageTable = Table.FromList(Source, Splitter.SplitByNothing()),
    ExpandedRecords = Table.ExpandRecordColumn(PageTable, "Column1", {"PAGE", "RESULTS"}),
    // Flatten the nested tables and give the columns friendly names
    ExpandedResults = Table.ExpandTableColumn(
        ExpandedRecords,
        "RESULTS",
        {"data.country_currency_desc", "data.exchange_rate", "data.record_date"},
        {"Country", "ExchangeRate", "RecordDate"}
    )
in
    ExpandedResults
```
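This method makes at least one request more than the others, because the only way to detect the end is to ask for a page that comes back empty, and each request has to wait for the previous one. In exchange, it is extremely flexible. It continues requesting pages as long as data is returned. When the response is empty or includes only nulls, the loop stops. Note that a failed request also produces nulls in GetPage, so a transient error ends the loop early. This allows for fully dynamic pagination handling in Power Query.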
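The stopping behavior can be tried without calling any API. The sketch below replaces GetPage with a stand-in, FakeGetPage, that serves three pages and then empty tables, and it adds a MaxPages cap as a precaution against an API that never returns an empty page. Both names are assumptions made for this example.

```m
let
    // Stand-in data source: three pages of rows (made-up data)
    FakePages = {
        #table({"id"}, {{1}, {2}}),
        #table({"id"}, {{3}, {4}}),
        #table({"id"}, {{5}})
    },
    // Returns the requested page, or an empty table past the end
    FakeGetPage = (page as number) as table =>
        if page <= List.Count(FakePages) then FakePages{page - 1} else #table({"id"}, {}),

    MaxPages = 100,   // safety cap so a misbehaving source cannot loop forever

    Pages = List.Generate(
        () => [Page = 1, Results = FakeGetPage(1)],
        each [Page] <= MaxPages and Table.RowCount([Results]) > 0,
        each [Page = [Page] + 1, Results = FakeGetPage([Page] + 1)],
        each [Results]
    ),
    Combined = Table.Combine(Pages)   // five rows, ids 1 to 5
in
    Combined
```

Page 4 is requested once and found empty, which is the extra request this method always pays for.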
Choosing the Right Pagination Strategy
The strategy you choose depends entirely on the structure and behavior of the API you are working with. When the page count is explicitly given, a static range-based approach is both simple and efficient. When only record counts are known, you can still create a reliable loop. When no metadata is available, List.Generate gives you control and flexibility.
To summarize:
- Use a page range (from 1 to total-pages) when total page count is available.
- Calculate page range based on total-count / count when entry count is available.
- Use List.Generate when no metadata is available and pages must be discovered dynamically.
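By mastering these techniques, you can reliably extract data from most page-numbered REST APIs using Power Query. APIs that expose only cursors or next links need the same List.Generate pattern, driven by the link or token in each response rather than by a page number.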
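The decision can also be encoded in a small helper that inspects the metadata and reports a page count when one can be derived. PageCountFromMeta is a hypothetical name, and the field names follow the total-pages, total-count, and count fields used in this article.

```m
let
    // Hypothetical helper: page count if the metadata allows it, otherwise null
    PageCountFromMeta = (meta as nullable record) as nullable number =>
        if meta = null then
            null
        else if Record.HasFields(meta, "total-pages") then
            Record.Field(meta, "total-pages")
        else if Record.HasFields(meta, {"total-count", "count"}) then
            Number.RoundUp(meta[#"total-count"] / meta[count])
        else
            null,

    FromPages = PageCountFromMeta([#"total-pages" = 7]),                // 7
    FromCounts = PageCountFromMeta([#"total-count" = 45, count = 10]),  // 5
    Unknown = PageCountFromMeta([])                                     // null
in
    [FromPages = FromPages, FromCounts = FromCounts, Unknown = Unknown]
```

A null result is the signal to fall back to the List.Generate approach.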
Optimizing Performance and Managing Large Paginated Responses in Power Query
Performance Optimization Using Pagination
When working with RESTful APIs, especially those that return large datasets, performance becomes a significant concern. If an API returns thousands or even millions of records in a single response, it can overwhelm both client and server resources. Pagination addresses this issue by breaking the dataset into smaller chunks and sending only a limited number of records per request.
This division into pages leads to substantial performance benefits:
- Smaller payloads mean faster transmission over the network.
- Reduced memory usage per request allows client systems (like Power BI or Excel) to function smoothly.
- Server-side throttling and timeouts are less likely because each API call is light and efficient.
- Systems with constrained bandwidth or older hardware can still process paginated data reliably.
The benefit of pagination is also visible in Power Query when combining multiple data sources or performing transformations. Each page arrives as a small table that can be inspected and shaped on its own before it is appended to the rest, which makes problems easier to isolate than in a single very large response.
Query folding does not apply to generic web sources read with Web.Contents, so any reduction in data has to be requested from the API itself through its query parameters. Within Power Query, keeping the per-page steps light, for example by selecting only the needed columns, helps keep refreshes responsive even when processing thousands of pages.
Efficient Memory Usage with Power Query
Memory consumption is one of the most important factors when dealing with API data in Power Query. If the entire dataset is loaded in one shot, it can lead to out-of-memory errors or slow query performance, particularly in environments like Excel or Power BI Desktop.
Pagination improves memory handling in several ways:
- Only a subset of data is loaded at a time.
- Each page is a small response, so no single request has to be parsed as one very large document.
- When using dynamic paging (as shown with List.Generate), pages are produced one at a time, as later steps ask for them.
Power Query's lazy evaluation model complements pagination. Values are generally computed only when a later step needs them, rather than all up front. This is particularly helpful at query design time, when the editor preview needs only the first rows. How long a fetched page stays in memory is decided by the engine, and the final combined table still has to fit within the limits of the host application.
Users can further improve memory efficiency by applying filters early in the query process. For example, if only records after a certain date are needed, passing a date filter to the API from within the GetPage function limits the data that must be retrieved and processed, reducing both memory use and processing time. A filter applied in Power Query after a page has been downloaded still trims the final table, but it does not reduce what is transferred.
Another good practice is to limit the number of columns returned. Rather than expanding all fields, select only the necessary columns using the fields parameter of the API or by applying a Table.SelectColumns transformation within Power Query.
Reducing Network Traffic Through Smart Pagination
Pagination not only helps with memory and performance, but also plays a critical role in reducing network traffic. By default, a REST API might return a large number of records per page, which can result in high data transfer volumes. Many APIs offer query parameters to control the page size, such as page[size]=50.
By choosing an appropriate page size, users can fine-tune the balance between network usage and performance. For example:
- A smaller page size (10–50 records) keeps each response small and quick to return, at the cost of more requests overall.
- A moderate page size (100–250 records) balances efficiency with performance for most use cases.
- A large page size (500+ records) should be used cautiously, only when system resources are sufficient.
Using parameters to dynamically control page size also makes queries more adaptable. In Power Query, you can define a pageSize variable and pass it to your API endpoint dynamically. This provides flexibility when experimenting with different performance profiles, depending on the volume and structure of your data.
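One way to sketch this is a helper that turns a page number and a page size into a query string. BuildQuery is a hypothetical name; page[number] and page[size] are the parameter names mentioned in this article, and Uri.BuildQueryString takes care of the URL encoding.

```m
let
    // Hypothetical helper that builds the query string for a given page and page size
    BuildQuery = (page as number, pageSize as number) as text =>
        Uri.BuildQueryString([
            #"page[number]" = Number.ToText(page),
            #"page[size]" = Number.ToText(pageSize),
            fields = "country_currency_desc,exchange_rate,record_date"
        ]),

    PageSize = 50,   // could be a Power Query parameter instead of a literal
    Example = BuildQuery(2, PageSize)
in
    Example
```

Changing PageSize in one place then affects every request, which makes it easy to compare refresh times at different page sizes.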
Additionally, network traffic can be reduced by using caching mechanisms or conditional queries. Power Query caches some results while a query is being evaluated and for previews in the editor, but a refresh generally requests the data again, so this should not be relied on to avoid API calls. Users can also use custom logic to skip repeated data fetches, such as checking for existing records or setting up incremental refresh based on dates or keys.
Best Practices for Working with Paginated APIs in Power Query
Working with paginated data in Power Query requires a balance of technical strategy and resource management. Following best practices ensures scalability, efficiency, and accuracy:
Apply Filters Early
Always filter data as close to the source as possible. If the API supports filtering via query parameters (such as by date, status, or country), use them. Filtering early reduces the amount of data transmitted and processed.
Adding the filter to the API URL means the server returns only the data needed, saving time and bandwidth.
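As a hedged illustration, the function below assumes the exchange rate API used earlier accepts a filter parameter of the form field:operator:value, for example record_date:gte:2024-01-01. That syntax, and the name GetPageSince, are assumptions for this sketch and should be checked against the API's documentation before use.

```m
// GetPageSince (hypothetical name): one page of records on or after a given date
(fromDate as date, page as number) as table =>
let
    Response = Json.Document(
        Web.Contents(
            "https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange",
            [
                Query = [
                    // Assumed filter syntax: field:operator:value
                    filter = "record_date:gte:" & Date.ToText(fromDate, "yyyy-MM-dd"),
                    #"page[number]" = Number.ToText(page),
                    fields = "country_currency_desc,exchange_rate,record_date"
                ]
            ]
        )
    )
in
    // The data field is a list of records, one per matching row
    Table.FromRecords(Response[data])
```

Because the server applies the filter, the page count in the metadata shrinks as well, so fewer requests are needed overall.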
Minimize Data Columns
Request only the fields you need using the fields parameter in the URL or by selecting specific columns in Power Query. Reducing the number of fields drastically improves performance and clarity during analysis.
For example:
…&fields=country,exchange_rate,record_date
In Power Query, use Table.SelectColumns to trim down the result set to what is truly required.
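A minimal sketch of the Power Query side, using a one-row table with made-up values:

```m
let
    // Made-up sample row with one column that the report does not need
    Sample = #table(
        {"country", "exchange_rate", "record_date", "internal_note"},
        {{"Canada", "1.35", "2024-03-31", "not needed"}}
    ),
    // Keep only the columns required downstream
    Trimmed = Table.SelectColumns(Sample, {"country", "exchange_rate", "record_date"})
in
    Trimmed
```

Selecting columns in Power Query tidies the result, but only the fields parameter reduces what the API actually sends.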
Use Robust Error Handling
APIs may fail due to timeouts, invalid responses, or data structure changes. Add error handling using the try…otherwise clause in Power Query. This ensures that your query doesn’t break entirely if one page fails.
Source = try Json.Document(…) otherwise [data = null]
Logging failed pages or adding fallback logic can make queries more resilient, especially for critical dashboards.
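One way to keep track of failures, sketched here with a stand-in fetcher that fails on purpose, is to use try without otherwise. It returns a record with a HasError field plus either Value or Error, so successes and failures can be separated afterwards. FakeGetPage and the simulated failure are assumptions for the demo.

```m
let
    // Stand-in page fetcher that fails on page 2
    FakeGetPage = (page as number) as table =>
        if page = 2 then error "Simulated timeout" else #table({"page"}, {{page}}),

    // try without otherwise keeps the outcome of every attempt
    Attempts = List.Transform({1 .. 3}, (p) => [Page = p, Attempt = try FakeGetPage(p)]),

    Succeeded = List.Transform(
        List.Select(Attempts, each not [Attempt][HasError]),
        each [Attempt][Value]
    ),
    FailedPages = List.Transform(
        List.Select(Attempts, each [Attempt][HasError]),
        each [Page]
    ),

    Data = Table.Combine(Succeeded)   // rows for pages 1 and 3
in
    [Data = Data, FailedPages = FailedPages]   // FailedPages = {2}
```

The list of failed pages can be loaded as its own query, so that gaps are visible in the report instead of being silently absorbed.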
Use Parameters for Flexibility
Define Power Query parameters for:
- Page size
- API base URL
- Date range
- Fields
This makes your queries modular, reusable, and easier to maintain. It also improves collaboration, as other users can update values without editing the main query logic.
Avoid Hardcoding Page Limits
When possible, avoid hardcoding the total number of pages. Instead, retrieve metadata dynamically or use the List.Generate approach. This ensures your query adapts to changes in data size without requiring manual updates.
Monitor Query Execution Time
For large datasets, monitor how long your query takes to run. If execution time grows beyond acceptable thresholds:
- Reduce page size
- Filter data more aggressively
- Move logic upstream to the API if it supports server-side aggregation or filtering
Document Your Query Logic
Use comments in Power Query (lines beginning with //) to describe what each part of your query does. This improves readability and makes it easier for others (or your future self) to understand and maintain the solution.
Perform Transformations After Expansion
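Delay client-side transformations such as type conversion, sorting, and joins until after data has been expanded from pages. Repeating this logic within each page can increase complexity and memory usage. Instead, collect all data first, then process it in a final table. The exception is filtering that the API can perform through query parameters, which should still happen at request time, as described under Apply Filters Early.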
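For instance, typing and sorting can be applied once to the combined table rather than inside the paging function. The rows below are made-up values in the shape produced by the earlier queries, and the en-US culture argument is an assumption about how the text values are formatted.

```m
let
    // Made-up combined result, with every value still stored as text
    Combined = #table(
        {"Country", "ExchangeRate", "RecordDate"},
        {
            {"Canada-Dollar", "1.354", "2024-03-31"},
            {"Mexico-Peso", "16.6", "2024-06-30"}
        }
    ),
    // Convert types once, on the final table
    Typed = Table.TransformColumnTypes(
        Combined,
        {{"ExchangeRate", type number}, {"RecordDate", type date}},
        "en-US"
    ),
    // Newest records first, then by country
    Sorted = Table.Sort(Typed, {{"RecordDate", Order.Descending}, {"Country", Order.Ascending}})
in
    Sorted
```

Keeping these steps out of the paging function also keeps that function small and easy to reuse.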
Conclusion
Efficiently handling paginated data in Power Query requires more than just fetching multiple pages. It involves using the right strategy based on what metadata the API provides, managing memory consumption, minimizing network usage, and following clean query design principles.
Pagination transforms how we interact with APIs by enabling scalability and reliability. By choosing the correct method—whether known page count, calculated page count, or dynamic fetching—you can confidently work with datasets of any size.
Power Query, when used with these best practices, becomes a robust tool for data professionals working with RESTful APIs, enabling seamless integration, high performance, and reliable automation.