emarket_data_explorer.data_process
This module provides the Shopee data-processing functionality.
Todo:
1. Once crawling runs in parallel in v1.4, this aggregation will no longer work, so a new way to aggregate is needed.
- class emarket_data_explorer.data_process.ShopeeAsyncCrawlerDataProcesser(data_source: str)
This class provides the data-processing functionality for the async version.
- aggregate_product_data(product_items_container: DataFrame, product_items: DataFrame) DataFrame
Accumulates the three items for a page and then aggregates them into a DataFrame.
- extract_product_data(product: Dict[str, Any]) None
Extracts ‘description’, ‘models’ and ‘hashtag_list’ from a product and appends them to lists for later pandas concatenation.
- parse_good_comments(text: str) List[Dict[str, Any]]
Parses the scraped comment data by loading it as JSON.
- parse_good_info(text: str) List[Dict[str, Any]]
Parses the scraped index data by loading it as JSON.
- parse_search_indexs(text: str) List[Dict[str, Any]]
Parses the scraped index data by loading it as JSON.
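The following is a minimal sketch of how the parse, extract and aggregate steps described above might fit together. It is not the package's implementation. The helper names (`parse_json_list`, `extract_fields`, `aggregate`), the `items` key in the JSON payload, and the sample data are all assumptions made for illustration. Only the three field names come from the documentation above.

```python
import json
from typing import Any, Dict, List

import pandas as pd


def parse_json_list(text: str, key: str = "items") -> List[Dict[str, Any]]:
    """Hypothetical stand-in for the parse_* methods: load JSON text and
    return the list stored under `key` (the key name is an assumption)."""
    payload = json.loads(text)
    return list(payload.get(key) or [])


def extract_fields(product: Dict[str, Any], rows: List[Dict[str, Any]]) -> None:
    """Hypothetical stand-in for extract_product_data: append the three
    documented fields to a list so they can be concatenated later."""
    rows.append(
        {
            "description": product.get("description"),
            "models": product.get("models"),
            "hashtag_list": product.get("hashtag_list"),
        }
    )


def aggregate(container: pd.DataFrame, items: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for aggregate_product_data: stack one page of
    rows under what has been collected so far."""
    if container.empty:
        # Nothing collected yet, so the page itself is the running result.
        return items.reset_index(drop=True)
    return pd.concat([container, items], ignore_index=True)


# Made-up payload; real Shopee responses will look different.
text = json.dumps(
    {
        "items": [
            {"description": "red mug", "models": ["S", "L"], "hashtag_list": ["#mug"]},
            {"description": "blue cup", "models": [], "hashtag_list": []},
        ]
    }
)

rows: List[Dict[str, Any]] = []
for product in parse_json_list(text):
    extract_fields(product, rows)

container = pd.DataFrame(columns=["description", "models", "hashtag_list"])
container = aggregate(container, pd.DataFrame(rows))
```

Collecting plain dicts in a list and building one DataFrame per page avoids repeated row-by-row appends, which are slow in pandas.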