Managing Product Page Sessions

The ProductPage class acts as a high-level client for interacting with a product page. There are several ways to create and manage ProductPage instances, depending on the use case.

Creating Product Pages

Instantiating ProductPage does not automatically establish a client session. The session is started and closed explicitly using the start_session and close_session methods, respectively. The page data is fetched using the update method.

import asyncio
from freshpointsync import ProductPage

LOCATION_ID = 296

async def main() -> None:
    page = ProductPage(location_id=LOCATION_ID)
    try:
        await page.start_session()
        print(f'Fetching data for location ID {page.data.location_id}...')
        await page.update()
        print(f'Location Name: {page.data.location}')
    finally:
        await page.close_session()

if __name__ == '__main__':
    asyncio.run(main())

In the example above, a new ProductPage instance is created with the location ID 296. The session is started, the page data is fetched, and the location name is printed. Lastly, the session is closed in the finally block, which ensures that it is closed even if an error occurs while fetching the data.

Note

If you specify a non-existent location ID, the page data will remain empty after the update.

Leveraging the Asynchronous Context Manager

ProductPage implements the asynchronous context manager protocol. This means that you can use the async with statement to manage the session lifecycle automatically, without calling start_session and close_session explicitly or wrapping the code in a try-finally block.

import asyncio
from freshpointsync import ProductPage

LOCATION_ID = 296

async def main() -> None:
    async with ProductPage(location_id=LOCATION_ID) as page:
        print(f'Fetching data for location ID {page.data.location_id}...')
        await page.update()
        print(f'Location Name: {page.data.location}')

if __name__ == '__main__':
    asyncio.run(main())

The example above is equivalent to the previous one. The ProductPage instance is created, the session is started, the page data is fetched, and the location name is printed. The session is closed automatically when the context manager exits.
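To make the equivalence concrete, here is a minimal sketch of how an asynchronous context manager can map onto explicit session calls. SessionStub is a hypothetical stand-in written for this illustration, not the actual ProductPage implementation; it only records which lifecycle methods were called, so it runs without network access.

```python
import asyncio


class SessionStub:
    """Hypothetical stand-in for ProductPage; it only records lifecycle calls."""

    def __init__(self) -> None:
        self.events: list[str] = []

    async def start_session(self) -> None:
        self.events.append('start_session')

    async def update(self) -> None:
        self.events.append('update')

    async def close_session(self) -> None:
        self.events.append('close_session')

    async def __aenter__(self) -> 'SessionStub':
        # Entering the `async with` block starts the session.
        await self.start_session()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Leaving the block closes the session, even after an exception.
        await self.close_session()


async def manual() -> list[str]:
    page = SessionStub()
    try:
        await page.start_session()
        await page.update()
    finally:
        await page.close_session()
    return page.events


async def with_context_manager() -> list[str]:
    async with SessionStub() as page:
        await page.update()
    return page.events


manual_events = asyncio.run(manual())
context_events = asyncio.run(with_context_manager())
print(manual_events == context_events)  # True: same calls, same order
```

Both functions produce the same sequence of calls: start_session, update, close_session.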

Tip

Whether to manage the session manually or use the context manager depends on the use case and the desired level of control over the object lifecycle.

While the async with statement is more concise and convenient for simple one-time use cases, manual session management may be suitable in more complex scenarios, such as when you need to reuse the object multiple times or store it as a class attribute.
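As a rough illustration of the manual approach, the sketch below stores a page object as a class attribute and reuses it across several updates. PageMonitor and StubPage are hypothetical names introduced for this example. The monitor relies only on the start_session, update, and close_session methods described above, so a ProductPage instance could presumably be passed in place of the stub.

```python
import asyncio


class PageMonitor:
    """Hypothetical long-lived component that owns a page object."""

    def __init__(self, page) -> None:
        # `page` is any object with async start_session, update,
        # and close_session methods.
        self.page = page
        self.refresh_count = 0

    async def start(self) -> None:
        await self.page.start_session()

    async def refresh(self) -> None:
        # The same page object (and session) is reused on every call.
        await self.page.update()
        self.refresh_count += 1

    async def stop(self) -> None:
        await self.page.close_session()


class StubPage:
    """Stand-in page so the sketch runs without network access."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def start_session(self) -> None:
        self.calls.append('start_session')

    async def update(self) -> None:
        self.calls.append('update')

    async def close_session(self) -> None:
        self.calls.append('close_session')


async def main() -> StubPage:
    page = StubPage()
    monitor = PageMonitor(page)
    await monitor.start()
    try:
        for _ in range(3):
            await monitor.refresh()
    finally:
        # Close the session once, after all updates are done.
        await monitor.stop()
    return page


stub = asyncio.run(main())
print(stub.calls)
```

The session is opened once, used for three updates, and closed once. This is the kind of lifecycle that is awkward to express with a single async with block when the object outlives one function call.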

Serializing Page Data

The product page data is represented by a ProductPageData object, which is a Pydantic model. It is stored in the data attribute of the ProductPage instance. The page data is empty upon initialization and is populated when the update method is called. However, you can provide initial data when creating a new page instance by passing a ProductPageData object to the data parameter of the ProductPage constructor. You can also serialize the data and store it between application sessions.

Note

Pydantic models let you include or exclude fields during serialization by passing the include and exclude parameters to the model_dump and model_dump_json methods. By default, all fields are included.

But what if you want to include or exclude specific fields of a nested model? For example, the products field of the ProductPageData model is a dictionary that maps product IDs to Product models. If you want to exclude the info and pic_url fields of every Product model in that dictionary, you can use the following syntax:

data = page.data.model_dump(
    exclude={'products': {'__all__': {'info', 'pic_url'}}}
)

To demonstrate serialization and deserialization of the page data, let’s implement a script that fetches the page data on each run and reports whether the page has changed since the previous run.

import asyncio
from pathlib import Path
from freshpointsync import ProductPage, ProductPageData

LOCATION_ID = 296
CACHE_FILENAME = f'pageData_{LOCATION_ID}.json'

def load_from_file(file_path: str) -> ProductPageData:
    print(f"Loading data from cache file '{file_path}'...")
    with open(file_path, 'r', encoding='utf-8') as f:
        return ProductPageData.model_validate_json(f.read())

def dump_to_file(data: ProductPageData, file_path: str) -> None:
    print(f"Dumping data to cache file '{file_path}'...")
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write(data.model_dump_json(indent=4, by_alias=True))

async def main() -> None:
    cache_file = Path(CACHE_FILENAME)
    if cache_file.exists():
        data = load_from_file(CACHE_FILENAME)
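        # Remember the cached hash before updating, so the comparison
        # does not depend on whether update() replaces or mutates the data.
        previous_hash = data.html_hash
        async with ProductPage(data=data) as page:
            print(f'Updating data for location ID {page.data.location_id}...')
            await page.update()
            if page.data.html_hash != previous_hash: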
                print('Product page has changed since the last update.')
            else:
                print('Product page has not changed since the last update.')
            dump_to_file(page.data, CACHE_FILENAME)
    else:
        async with ProductPage(location_id=LOCATION_ID) as page:
            print(f'Fetching data for location ID {page.data.location_id}...')
            await page.update()
            dump_to_file(page.data, CACHE_FILENAME)
        print('[tip] Run the script again to check for updates.')

if __name__ == '__main__':
    asyncio.run(main())

In the example above, if the cache file pageData_296.json exists, a ProductPageData object for location ID 296 is created from the serialized data stored in it, and a new ProductPage instance is created with this data. The page data is then updated, and the script prints whether the page has changed since the last update by comparing the MD5 hash of the page HTML contents. Finally, the updated data is serialized and written back to the file. If the cache file does not exist yet, the script fetches the data and creates the file.
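The idea behind the hash comparison can be sketched with the standard library alone. The helper names html_fingerprint and has_changed are hypothetical, and the exact way the library computes html_hash (for example, the text encoding or any normalization of the HTML) is an assumption here; the sketch only shows the general technique of detecting changes by comparing digests instead of full page contents.

```python
import hashlib


def html_fingerprint(html: str) -> str:
    """Return the hex MD5 digest of the UTF-8 encoded HTML text."""
    return hashlib.md5(html.encode('utf-8')).hexdigest()


def has_changed(cached_hash: str, new_html: str) -> bool:
    """Compare a previously stored digest with the digest of new HTML."""
    return html_fingerprint(new_html) != cached_hash


# Digest stored during a previous run (illustrative HTML, not real page data).
cached = html_fingerprint('<ul><li>Sandwich</li></ul>')

print(has_changed(cached, '<ul><li>Sandwich</li></ul>'))
print(has_changed(cached, '<ul><li>Sandwich</li><li>Salad</li></ul>'))
```

Storing only the digest keeps the cache comparison cheap: identical HTML yields an identical digest, while any change to the contents yields a different one with overwhelming probability.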

Tip

It is possible to create an empty ProductPageData object. The only required field is location_id. Instantiating a ProductPage with such an object is equivalent to passing the location ID directly to the constructor.