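Content pagination
Pagination lets you divide large amounts of data into smaller, more manageable pieces.
Introduction
The Rapid Content API provides access to a large amount of property data. Because of this volume, the Content API supports pagination to split the data into smaller, manageable pieces. This document explains how to use pagination through several examples and best practices.
Basic example
Pagination starts with a property search that returns more results than fit on a single page. When that happens, the response contains the first page of results along with a `Link` response header that points to the next page.
Example request: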
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia
Example response header:
Link: <https://api.ean.com/v3/properties/content?token=WVZTCUNRUQ4SUSNHAXIWFk0VRQ5JZhFWExRaXAgRVnpDA1RWTUkBB10FHFYGQwZyHwBNFA1HBhIMC1IGAUsGBhkHBGcHBBUGdlMHQAd1UA8WBwwMB1NcBAhdahBWUAdRXjtfVwpEBiEdASBHREpdEVwUQRxuRVgRWg1UaVkHS1kcA3IWBXEVAFZMAz1VBVRWXT5KRQNKFQVEACMXASJFVlRBVzoTRQZVQQRVOUdHVUAVDRRXIBNXJxdYAwtWQFJeVgpHAiYTCwoEWhRmZ0MHCxwFJhNbUEcGU1tCHW1dAWwAGlEIEAFVXEYNIRQBIRcTSltIAUVHTTxdAghAU3VDDSFCVkYCXFE8XgIMQwF7QAAlFwVZEVJpTUdWBBcHU2cBXgEKRgFwFVdxR1tWQQtHQhhuUFgAAA5WE1oKH0JcBEZVDGdVBBdVQl4BVQgFVRIVEFwWBBdHS2xKBU1RDANvDFFfX0cNekZTcxJeE1gQW24XDw8RDEdTIUBTJhFTAxZXb1lUAVNRa1ZZAFxHAXQVUHxDVxdDUAxcFRVmVFpQBlRbFFNxEAwgRXcMXAdfFUZbBFQAXFQGV1YCAVI=>; rel="next"; expires=2023-06-01T17:13:19.699379618Z
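Following the provided link before its expiration time returns the next page of results, and each subsequent page carries a new `Link` header. To page through the entire response, keep following each returned `Link` header until no further `Link` header is returned. This indicates that you have reached the end of the requested data set.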
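The loop described above depends on reading the `Link` header correctly. As a minimal sketch, assuming the header always has the shape shown in the example (`<url>; rel="next"; expires=<ISO-8601 timestamp>`), the hypothetical `NextLink` helper below extracts the next-page URL and its expiration time. The class and method names are illustrative and are not part of Rapid.

```java
import java.time.Instant;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Rapid): parses a pagination Link header. */
final class NextLink {
    // Assumed header shape, taken from the example above:
    // <URL>; rel="next"; expires=ISO-8601-timestamp
    private static final Pattern LINK =
            Pattern.compile("<([^>]+)>\\s*;\\s*rel=\"next\"\\s*;\\s*expires=(\\S+)");

    final String url;
    final Instant expires;

    private NextLink(String url, Instant expires) {
        this.url = url;
        this.expires = expires;
    }

    /** Returns the parsed link, or empty when the header is missing or has another shape. */
    static Optional<NextLink> parse(String header) {
        if (header == null || header.isBlank()) {
            return Optional.empty(); // no Link header: the last page has been reached
        }
        Matcher matcher = LINK.matcher(header);
        if (!matcher.find()) {
            return Optional.empty();
        }
        return Optional.of(new NextLink(matcher.group(1), Instant.parse(matcher.group(2))));
    }

    /** True while the link has not yet expired at the given moment. */
    boolean isUsableAt(Instant now) {
        return now.isBefore(expires);
    }
}
```

A caller would request `url` while `isUsableAt(Instant.now())` holds, and stop paging as soon as `parse` returns an empty result.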
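Filtering the requested data
The simple example above shows how pagination works, but it is also a very large search. With so many properties, paging through all of them can take a while. Adding query parameters narrows the search to only the properties you actually need.
For example, a traveler may only need properties in the United States rather than every property. Adding the `country_code` query parameter to the request returns just that subset of properties.
Example request with the country parameter: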
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US
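This example still paginates in the same way as above, but there are fewer properties to page through.
Another way to reduce the number of properties requested is to fetch only those that have changed since you last pulled property data. The `date_updated_start` parameter returns only properties that have changed since the given date.
Example request with the country and date parameters: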
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US&date_updated_start=2023-01-02
The key to faster pagination and less data transfer is to request only the properties you need.
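The filtered requests above differ only in their query parameters, so they can be composed programmatically. The sketch below is one possible way to do that; `ContentQuery` and its method names are hypothetical, and only the parameter names (`language`, `supply_source`, `country_code`, `date_updated_start`) come from the examples above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical builder (names are illustrative) for filtered content requests. */
final class ContentQuery {
    private static final String ENDPOINT = "https://api.ean.com/v3/properties/content";

    // LinkedHashMap keeps the parameters in the order they were added.
    private final Map<String, String> params = new LinkedHashMap<>();

    ContentQuery(String language, String supplySource) {
        params.put("language", language);
        params.put("supply_source", supplySource);
    }

    /** Restricts the search to a single country. */
    ContentQuery countryCode(String code) {
        params.put("country_code", code);
        return this;
    }

    /** Restricts the search to properties changed since the given date. */
    ContentQuery updatedSince(LocalDate date) {
        params.put("date_updated_start", date.toString()); // ISO-8601, e.g. 2023-01-02
        return this;
    }

    /** Renders the request URL with URL-encoded parameter values. */
    String toUrl() {
        return ENDPOINT + "?" + params.entrySet().stream()
                .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }
}
```

For instance, `new ContentQuery("en-US", "expedia").countryCode("US").updatedSince(LocalDate.of(2023, 1, 2)).toUrl()` reproduces the last example request above.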
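Splitting the search for parallel processing
Sometimes, even when you request only the properties you need, the result set is still very large. In that case, running several searches at the same time can speed up the process.
The first step is to break the desired search into smaller searches. How to do this differs for each use case, but you can start from the desired search and add further query parameters whose values do not overlap.
For example, if the desired search is for all properties in the United States, start by filtering on country, as in the example above.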
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US
Then split this search further with parameters such as `property_rating_min` and `property_rating_max`.
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US&property_rating_min=0.0&property_rating_max=0.9
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US&property_rating_min=1.0&property_rating_max=1.9
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US&property_rating_min=2.0&property_rating_max=2.9
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US&property_rating_min=3.0&property_rating_max=3.9
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US&property_rating_min=4.0&property_rating_max=4.9
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US&property_rating_min=5.0
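Now there are six separate requests, each of which can be paged through independently (or concurrently). The combined results are the same data set as the original search, but they are returned faster.
Every situation is different, but starting with the desired search and checking the `pagination-total-results` response header on the first page of the response indicates whether splitting the search would make pagination more efficient.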
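The six rating-based requests follow a regular pattern, so they can be generated rather than written by hand. The following sketch assumes whole-star buckets exactly as listed above; `RatingBuckets`, `urls()`, and `worthSplitting()` are hypothetical names, and the threshold passed to `worthSplitting()` is a value you would choose for your own use case.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (names are illustrative) that splits one search by star rating. */
final class RatingBuckets {
    private static final String BASE =
            "https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US";

    /** Builds one URL per bucket: 0.0-0.9, 1.0-1.9, ..., 4.0-4.9, then 5.0 and above. */
    static List<String> urls() {
        List<String> urls = new ArrayList<>();
        for (int star = 0; star <= 5; star++) {
            StringBuilder url = new StringBuilder(BASE)
                    .append("&property_rating_min=").append(star).append(".0");
            if (star < 5) {
                // The top bucket has no upper bound, matching the last example request.
                url.append("&property_rating_max=").append(star).append(".9");
            }
            urls.add(url.toString());
        }
        return urls;
    }

    /**
     * Decides whether a search is large enough to be worth splitting.
     * The argument is the raw value of the pagination-total-results header;
     * the threshold is an assumption, not a value defined by Rapid.
     */
    static boolean worthSplitting(String paginationTotalResults, int threshold) {
        return Integer.parseInt(paginationTotalResults.trim()) > threshold;
    }
}
```

Each URL returned by `urls()` can then be paged through on its own thread or task, since the buckets do not overlap.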
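Code example
The information above only outlines the pagination process and how to split the data; the Java code below gives a more concrete example.
**Note:** The following code examples do not include proper exception handling or other best practices. As always, follow all best practices when writing production code.
First, a simple `RapidClient` class can serve as the basis for making Rapid calls.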
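// Imports assume Jakarta REST (jakarta.ws.rs) with Jersey and Apache Commons Codec.
// On older stacks the same JAX-RS types live under javax.ws.rs.
import jakarta.ws.rs.client.Client;
import jakarta.ws.rs.client.ClientBuilder;
import jakarta.ws.rs.client.WebTarget;
import jakarta.ws.rs.core.HttpHeaders;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.MultivaluedMap;
import jakarta.ws.rs.core.Response;
import java.text.MessageFormat;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.List;
import java.util.Map;
import org.apache.commons.codec.digest.DigestUtils;
import org.glassfish.jersey.message.GZipEncoder;

public class RapidClient {
    // Base URL
    private static final String RAPID_BASE_URL = "https://api.ean.com";
    // Headers
    private static final String GZIP = "gzip";
    private static final String AUTHORIZATION_HEADER = "EAN APIKey={0},Signature={1},timestamp={2}";
    // HTTP Client
    private static final Client CLIENT = ClientBuilder.newClient().register(GZipEncoder.class);

    private final String apiKey;
    private final String sharedSecret;

    public RapidClient(String apikey, String sharedSecret) {
        this.apiKey = apikey;
        this.sharedSecret = sharedSecret;
    }

    public Response get(String path, MultivaluedMap<String, String> queryParameters) {
        WebTarget webTarget = CLIENT.target(RAPID_BASE_URL).path(path);
        // Add all query parameters from the map to the web target
        for (Map.Entry<String, List<String>> entry : queryParameters.entrySet()) {
            for (String value : entry.getValue()) {
                webTarget = webTarget.queryParam(entry.getKey(), value);
            }
        }
        return webTarget.request(MediaType.APPLICATION_JSON_TYPE)
                .header(HttpHeaders.ACCEPT_ENCODING, GZIP)
                .header(HttpHeaders.AUTHORIZATION, generateAuthHeader())
                .get();
    }

    // Signature = SHA-512 hex digest of API key + shared secret + current UTC timestamp in seconds
    private String generateAuthHeader() {
        final String timeStampInSeconds = String.valueOf(ZonedDateTime.now(ZoneOffset.UTC).toEpochSecond());
        final String input = apiKey + sharedSecret + timeStampInSeconds;
        final String signature = DigestUtils.sha512Hex(input);
        return MessageFormat.format(AUTHORIZATION_HEADER, apiKey, signature, timeStampInSeconds);
    }
}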
This is just boilerplate code that makes the classes that follow easier to read.
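The signature step in `generateAuthHeader()` relies on Apache Commons Codec. As a sketch of the same computation using only the JDK, the hypothetical `RapidSignature` class below hashes the API key, shared secret, and timestamp with `MessageDigest`. The class name and the explicit `epochSeconds` argument are assumptions made so that the result is easy to test; the header layout is copied from `RapidClient`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical JDK-only variant of the signature logic shown in RapidClient. */
final class RapidSignature {

    /** Returns the lowercase hex SHA-512 digest of apiKey + sharedSecret + epochSeconds. */
    static String sign(String apiKey, String sharedSecret, long epochSeconds) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            byte[] hash = digest.digest(
                    (apiKey + sharedSecret + epochSeconds).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available on this JVM", e);
        }
    }

    /** Formats the Authorization header value in the same layout RapidClient uses. */
    static String authorizationHeader(String apiKey, String sharedSecret, long epochSeconds) {
        return "EAN APIKey=" + apiKey
                + ",Signature=" + sign(apiKey, sharedSecret, epochSeconds)
                + ",timestamp=" + epochSeconds;
    }
}
```

In real use the timestamp would come from the current UTC time, for example `Instant.now().getEpochSecond()`, so each header is only valid for requests made close to the moment it was generated.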
The next class represents a specific Content API call and uses `RapidClient` to make it.
// Imports assume Jakarta REST (jakarta.ws.rs), Apache Commons Collections 4 and Commons Lang 3.
import jakarta.ws.rs.core.GenericType;
import jakarta.ws.rs.core.MultivaluedHashMap;
import jakarta.ws.rs.core.MultivaluedMap;
import jakarta.ws.rs.core.Response;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;
import org.apache.commons.collections4.CollectionUtils;
import org.apache.commons.collections4.MapUtils;
import org.apache.commons.lang3.StringUtils;

public class PropertyContentCall {
    // Path
    private static final String PROPERTY_CONTENT_PATH = "v3/properties/content";
    // Headers
    private static final String LINK = "Link";
    private static final String PAGINATION_TOTAL_RESULTS = "Pagination-Total-Results";
    // Query parameters keys
    private static final String LANGUAGE = "language";
    private static final String SUPPLY_SOURCE = "supply_source";
    private static final String COUNTRY_CODE = "country_code";
    private static final String CATEGORY_ID = "category_id";
    private static final String TOKEN = "token";
    private static final String INCLUDE = "include";
    // Call parameters
    private final RapidClient client;
    private final String language;
    private final String supplySource;
    private final List<String> countryCodes;
    private final List<String> categoryIds;
    // Pagination state
    private String token;
    private boolean exhausted;

    public PropertyContentCall(RapidClient client, String language, String supplySource,
                               List<String> countryCodes, List<String> categoryIds) {
        this.client = client;
        this.language = language;
        this.supplySource = supplySource;
        this.countryCodes = countryCodes;
        this.categoryIds = categoryIds;
    }

    public Stream<RapidPropertyContent> stream() {
        return Stream.generate(() -> {
                    synchronized (this) {
                        // Once a response arrives without a Link header, every page has been read.
                        // Return an empty page to end the stream instead of restarting the search.
                        if (exhausted) {
                            return Map.<String, RapidPropertyContent>of();
                        }
                        // Make the call to Rapid.
                        final Response response = client.get(PROPERTY_CONTENT_PATH, queryParameters());
                        // Read the response to return.
                        final Map<String, RapidPropertyContent> propertyContents = response.readEntity(new GenericType<>() { });
                        // Store the token for pagination if we got one.
                        token = getTokenFromLink(response.getHeaderString(LINK));
                        exhausted = (token == null);
                        return propertyContents;
                    }
                })
                .takeWhile(MapUtils::isNotEmpty)
                .map(Map::values)
                .flatMap(Collection::stream);
    }

    public Integer size() {
        // Make the call to Rapid.
        final MultivaluedMap<String, String> queryParameters = queryParameters();
        queryParameters.putSingle(INCLUDE, "property_ids");
        final Response response = client.get(PROPERTY_CONTENT_PATH, queryParameters);
        // Read the size to return.
        final Integer size = Integer.parseInt(response.getHeaderString(PAGINATION_TOTAL_RESULTS));
        // Close the response since we're not reading it.
        response.close();
        return size;
    }

    private MultivaluedMap<String, String> queryParameters() {
        final MultivaluedMap<String, String> queryParams = new MultivaluedHashMap<>();
        if (token != null) {
            // A pagination token replaces all of the original search parameters.
            queryParams.putSingle(TOKEN, token);
        } else {
            // Add required parameters
            queryParams.putSingle(LANGUAGE, language);
            queryParams.putSingle(SUPPLY_SOURCE, supplySource);
            // Add optional parameters
            if (CollectionUtils.isNotEmpty(countryCodes)) {
                queryParams.put(COUNTRY_CODE, countryCodes);
            }
            if (CollectionUtils.isNotEmpty(categoryIds)) {
                queryParams.put(CATEGORY_ID, categoryIds);
            }
        }
        return queryParams;
    }

    // Extracts the token value from a header of the form <...?token=VALUE>; rel="next"; ...
    private String getTokenFromLink(String linkHeader) {
        if (StringUtils.isEmpty(linkHeader)) {
            return null;
        }
        final int startOfToken = linkHeader.indexOf("=") + 1;
        final int endOfToken = linkHeader.indexOf(">");
        return linkHeader.substring(startOfToken, endOfToken);
    }
}
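`PropertyContentCall` represents a single request to the Rapid Content API and encapsulates the pagination involved in completing the call.
Example:
Compare the API call below with the equivalent Java request.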
https://api.ean.com/v3/properties/content?language=en-US&supply_source=expedia&country_code=US
PropertyContentCall request = new PropertyContentCall(myRapidClient, "en-US", "expedia", List.of("US"), null);
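- The `PropertyContentCall` used here is specific to this example. Calls are divided by `country_code` and `category_id` (although this may change depending on the use case). Because it is written for parallel processing, the example uses Java parallel streams. The public `stream()` method exists to return a stream of `RapidPropertyContent` objects. A `RapidPropertyContent` object is a simple POJO that represents a single property from a Rapid Content API call. Although Java parallel streams are used here, any way of running code in parallel would work.
- When the code that calls `stream()` needs to read the next property from the stream, the method supplies it if it has already been fetched; otherwise it calls the Rapid Content API for the next page of results and returns a property from that page. Simply calling `stream()` and reading it to the end handles the pagination for every property returned by the request.
- The other public helper method, `size()`, lets you easily see the total number of properties this `PropertyContentCall` will return. This helps determine whether the call is already small enough or needs to be split into smaller calls for parallel processing.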
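The lazy behavior of `stream()` can be seen in isolation with an in-memory stand-in for the API. In the sketch below, `PagedStreamDemo` and `FakePages` are hypothetical names; `FakePages` hands out one prepared page per call and then empty pages, which plays the role of the remote call. The stream pipeline uses the same `Stream.generate`, `takeWhile`, and `flatMap` steps as `PropertyContentCall`.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

/** Hypothetical demo of page-by-page streaming with an in-memory page source. */
final class PagedStreamDemo {

    /** Stand-in for the remote API: each call to next() returns one page, then empty maps. */
    static final class FakePages {
        private final Iterator<Map<String, String>> pages;
        final AtomicInteger fetches = new AtomicInteger(); // counts simulated API calls

        FakePages(List<Map<String, String>> pages) {
            this.pages = pages.iterator();
        }

        synchronized Map<String, String> next() {
            fetches.incrementAndGet();
            return pages.hasNext() ? pages.next() : Map.of();
        }
    }

    /** Flattens pages into single items, fetching a new page only when one is needed. */
    static Stream<String> stream(FakePages source) {
        return Stream.generate(source::next)      // one "API call" per generated element
                .takeWhile(page -> !page.isEmpty()) // an empty page ends the stream
                .map(Map::values)
                .flatMap(Collection::stream);
    }
}
```

Reading the stream to the end visits every item on every page, and no page is requested before the stream is consumed.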
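The building blocks above lay the groundwork for calling Rapid and paging through the responses. The following code uses these classes to automatically divide the call into manageable pieces, page through the smaller calls in parallel, and write the combined output to a file.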
// Imports assume Jackson (databind and the jsr310 datatype module) is on the classpath.
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;
import java.util.zip.GZIPOutputStream;

public class ParallelFileMaker {
    private static final String APIKEY = System.getenv().get("RAPID_APIKEY");
    private static final String SHARED_SECRET = System.getenv().get("RAPID_SHARED_SECRET");
    private static final List<String> COUNTRIES = Arrays.asList("AD", "AE", "AF", "AG", "AI", "AL", "AM", "AO", "AQ",
            "AR", "AS", "AT", "AU", "AW", "AX", "AZ", "BA", "BB", "BD", "BE", "BF", "BG", "BH", "BI", "BJ", "BL", "BM",
            "BN", "BO", "BQ", "BR", "BS", "BT", "BV", "BW", "BY", "BZ", "CA", "CC", "CD", "CF", "CG", "CH", "CI", "CK",
            "CL", "CM", "CN", "CO", "CR", "CU", "CV", "CW", "CX", "CY", "CZ", "DE", "DJ", "DK", "DM", "DO", "DZ", "EC",
            "EE", "EG", "EH", "ER", "ES", "ET", "FI", "FJ", "FK", "FM", "FO", "FR", "GA", "GB", "GD", "GE", "GF", "GG",
            "GH", "GI", "GL", "GM", "GN", "GP", "GQ", "GR", "GS", "GT", "GU", "GW", "GY", "HK", "HM", "HN", "HR", "HT",
            "HU", "ID", "IE", "IL", "IM", "IN", "IO", "IQ", "IR", "IS", "IT", "JE", "JM", "JO", "JP", "KE", "KG", "KH",
            "KI", "KM", "KN", "KP", "KR", "KW", "KY", "KZ", "LA", "LB", "LC", "LI", "LK", "LR", "LS", "LT", "LU", "LV",
            "LY", "MA", "MC", "MD", "ME", "MF", "MG", "MH", "MK", "ML", "MM", "MN", "MO", "MP", "MQ", "MR", "MS", "MT",
            "MU", "MV", "MW", "MX", "MY", "MZ", "NA", "NC", "NE", "NF", "NG", "NI", "NL", "NO", "NP", "NR", "NU", "NZ",
            "OM", "PA", "PE", "PF", "PG", "PH", "PK", "PL", "PM", "PN", "PR", "PS", "PT", "PW", "PY", "QA", "RE", "RO",
            "RS", "RU", "RW", "SA", "SB", "SC", "SD", "SE", "SG", "SH", "SI", "SJ", "SK", "SL", "SM", "SN", "SO", "SR",
            "SS", "ST", "SV", "SX", "SY", "SZ", "TC", "TD", "TF", "TG", "TH", "TJ", "TK", "TL", "TM", "TN", "TO", "TR",
            "TT", "TV", "TW", "TZ", "UA", "UG", "UM", "US", "UY", "UZ", "VA", "VC", "VE", "VG", "VI", "VN", "VU", "WF",
            "WS", "YE", "YT", "ZA", "ZM", "ZW");
    private static final List<String> PROPERTY_CATEGORIES = Arrays.asList("0", "1", "2", "3", "4", "5", "6", "7", "8",
            "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "20", "21", "22", "23", "24", "25", "26",
            "29", "30", "31", "32", "33", "34", "36", "37", "39", "40", "41", "42", "43", "44");
    private static final int MAX_CALL_SIZE = 20_000;
    private static final String LANGUAGE = "en-US";
    private static final String SUPPLY_SOURCE = "expedia";
    private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper()
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
            .registerModule(new JavaTimeModule());
    private static final RapidClient RAPID_CLIENT = new RapidClient(APIKEY, SHARED_SECRET);

    public void run() throws IOException {
        final Map<PropertyContentCall, Integer> allCalls = divideUpCalls();
        // Make sure we're making the calls in the most efficient order. This list will be smallest to largest, so
        // that when the streams get combined and are reversed, the largest stream will be first.
        final List<Stream<RapidPropertyContent>> callsToMake = allCalls.entrySet().stream()
                .filter(entry -> entry.getValue() > 0) // filter out any calls that don't have results
                .sorted(Map.Entry.comparingByValue()) // sort all the calls with the smallest calls first
                .map(Map.Entry::getKey) // just need the call itself now
                .map(PropertyContentCall::stream) // get the stream for each call
                .toList();
        // Combine all the streams into one big stream and actually make the calls and write to the file.
        try (Stream<RapidPropertyContent> bigStream = combineStreams(callsToMake);
             BufferedWriter outputFileWriter = createFileWriter(Path.of("output.jsonl.gz"))) {
            bigStream.parallel()
                    .forEach(property -> {
                        try {
                            // Write to output file
                            synchronized (outputFileWriter) {
                                outputFileWriter.append(OBJECT_MAPPER.writeValueAsString(property));
                                outputFileWriter.newLine();
                            }
                        } catch (Exception e) {
                            // Handle exception
                        }
                    });
        }
    }

    /**
     * This will split up the calls to be made based on the size of each call's results. It will first split into
     * calls per country and, if needed, it will then further split into calls per category for any country that is
     * too big on its own.
     * The size of each call is also kept so that the calls can be further sorted if needed.
     *
     * @return A map containing all the calls and their respective sizes.
     */
    private Map<PropertyContentCall, Integer> divideUpCalls() {
        // The map is filled from parallel streams, so it must be safe for concurrent writes.
        final Map<PropertyContentCall, Integer> allCalls = new ConcurrentHashMap<>();
        COUNTRIES.stream().parallel()
                .forEach(countryCode -> {
                    // Check to see if the entire country is small enough to get at once.
                    final PropertyContentCall countryCall = new PropertyContentCall(RAPID_CLIENT, LANGUAGE,
                            SUPPLY_SOURCE, List.of(countryCode), null);
                    final Integer countryCallSize = countryCall.size();
                    if (countryCallSize < MAX_CALL_SIZE) {
                        // It's small enough! No need to break this call up further.
                        allCalls.put(countryCall, countryCallSize);
                    } else {
                        // The country is too big, need to break up the call into smaller parts.
                        PROPERTY_CATEGORIES.stream().parallel()
                                .forEach(category -> {
                                    final PropertyContentCall categoryCall = new PropertyContentCall(RAPID_CLIENT,
                                            LANGUAGE, SUPPLY_SOURCE, List.of(countryCode), List.of(category));
                                    allCalls.put(categoryCall, categoryCall.size());
                                });
                    }
                });
        return allCalls;
    }

    /**
     * This will combine multiple Streams into a single Stream. Because of how this is reduced, the Streams will end
     * up in the reverse order of the list that was passed in.
     * <p>
     * Note: Because this is concatenating multiple Streams together, each Stream will go on the stack. Thus, if
     * there are many Streams then a StackOverflowError can occur when trying to use the combined Stream. Make
     * sure the stack size is appropriate for your usage via the `-Xss` JVM parameter.
     *
     * @param streams A list of the Streams to combine.
     * @return The combined Stream that can be treated as one.
     */
    private <T> Stream<T> combineStreams(List<Stream<T>> streams) {
        return streams.stream()
                .filter(Objects::nonNull)
                .reduce(Stream::concat)
                .orElse(Stream.empty());
    }

    // Creates a writer that gzip-compresses UTF-8 text into the given file.
    private BufferedWriter createFileWriter(Path path) throws IOException {
        return new BufferedWriter(
                new OutputStreamWriter(
                        new GZIPOutputStream(
                                Files.newOutputStream(path)),
                        StandardCharsets.UTF_8));
    }
}
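Although the code above contains many inline comments explaining each piece, it can be summarized as follows:
- Divide the main call into smaller calls based on the use case. (In this example, the main call fetches all results, and it is divided by `country_code` and then, where needed, by `category_id`.)
- Sort the calls to run more efficiently. This ordering is specific to this example, because of how the parallel streams are combined.
- Run the calls in parallel and write the properties they return to a file.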
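The first two steps of this summary can be tried without calling Rapid at all. The sketch below is a simplified stand-in, not the code above: `CallPlanner` is a hypothetical class, each call is represented by a plain string key such as `US` or `US/1`, and the `sizeOf` function takes the place of `PropertyContentCall.size()`.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToIntFunction;
import java.util.stream.Collectors;

/** Hypothetical planner that mirrors the divide-and-sort steps with plain string keys. */
final class CallPlanner {

    /**
     * Keeps a country as one call when it is below maxSize; otherwise splits it per category.
     * sizeOf receives a key such as "US" or "US/1" and returns the result count for that call.
     */
    static Map<String, Integer> plan(List<String> countries, List<String> categories,
                                     int maxSize, ToIntFunction<String> sizeOf) {
        // ConcurrentHashMap because the map is filled from a parallel stream.
        Map<String, Integer> calls = new ConcurrentHashMap<>();
        countries.parallelStream().forEach(country -> {
            int countrySize = sizeOf.applyAsInt(country);
            if (countrySize < maxSize) {
                calls.put(country, countrySize);
            } else {
                for (String category : categories) {
                    String key = country + "/" + category;
                    calls.put(key, sizeOf.applyAsInt(key));
                }
            }
        });
        return calls;
    }

    /** Drops calls without results and orders the rest from smallest to largest. */
    static List<String> ordered(Map<String, Integer> calls) {
        return calls.entrySet().stream()
                .filter(entry -> entry.getValue() > 0)
                .sorted(Map.Entry.<String, Integer>comparingByValue()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

With a 20,000-result limit, a country reporting 50,000 results is replaced by its per-category calls, while a country reporting 100 results stays as a single call.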