Snapshot Extraction¶

Module that handles all Factiva Analytics Extraction requests and objects. It contains classes and tools for running extraction jobs and downloading the generated files.

SnapshotExtraction¶

class factiva.analytics.snapshots.extraction.SnapshotExtraction(job_id=None, query=None, user_key=None)¶

Main class to interact with the Extractions service from Factiva Analytics.

user_key¶

User representation for service authentication

Type:: UserKey

query¶

Query object tailored for Extraction operations

Type:: SnapshotExtractionQuery

job_response¶

Object containing job status and execution details

Type:: SnapshotExtractionJobReponse

download_files(path=None)¶

Downloads all files from a job and stores them in the given path.

If the path parameter is empty, files are stored in a folder with the name of the job short id.

Parameters:: path (str, Optional) – String containing the path where the downloaded files are stored. If not provided, the files are stored in a folder named after the job short_id. If that folder does not exist, it is created in the current working directory.
Returns:: True if files were correctly downloaded, False if no files are available for download or the download failed.
Return type:: bool
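
The default-folder rule described for download_files can be sketched as follows. This is a minimal illustration, not the library's implementation: resolve_download_dir and its base_dir argument are hypothetical names introduced here.

```python
import os


def resolve_download_dir(path=None, short_id="", base_dir=None):
    """Return the folder where files would be stored, creating it if missing.

    Hypothetical helper: mirrors the documented rule that an empty `path`
    falls back to a folder named after the job short_id, located in the
    current working directory.
    """
    if not path and not short_id:
        raise ValueError("Either a path or a job short_id is required")
    if base_dir is None:
        base_dir = os.getcwd()
    # An explicit path wins; otherwise use <base_dir>/<short_id>.
    target = path if path else os.path.join(base_dir, short_id)
    os.makedirs(target, exist_ok=True)
    return target
```

Passing an explicit path keeps downloads from different jobs in one place, while the default keeps each job's files in its own folder.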

get_job_response() → bool¶

Performs a request to the API to obtain an updated status of a job execution.

If the job has been completed, result details are assigned to the job_response object.

Returns:: True if the get request was successful, False for FAILED jobs. An exception is raised for unexpected HTTP codes.
Return type:: bool
Raises:: ValueError – If the Job ID doesn’t exist for the user key, or the get request is invalid.

process_job(path=None)¶

Submits a new job to be processed, waits until the job is completed, and then retrieves the job results.

Returns:: True if the extraction processing was successful, False if the job execution failed. An exception is raised otherwise.
Return type:: bool
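
The submit, wait and retrieve sequence can be sketched with the documented methods alone. This is a rough illustration under stated assumptions, not the actual process_job code: run_until_finished, the polling interval, the poll limit and the state names JOB_STATE_DONE and JOB_STATE_FAILED are all assumptions made for this example.

```python
import time


def run_until_finished(extraction, poll_seconds=10.0, max_polls=360,
                       done_states=("JOB_STATE_DONE",),
                       failed_states=("JOB_STATE_FAILED",)):
    """Submit a job and poll until it reaches a terminal state.

    `extraction` is any object exposing submit_job(), get_job_response()
    and job_response.job_state, as documented for SnapshotExtraction.
    The state names are assumed values, not confirmed by this reference.
    """
    if not extraction.submit_job():
        return False
    for _ in range(max_polls):
        # get_job_response() refreshes job_response with the latest status.
        if not extraction.get_job_response():
            return False
        state = extraction.job_response.job_state
        if state in done_states:
            return True
        if state in failed_states:
            return False
        time.sleep(poll_seconds)
    raise TimeoutError("Job did not finish within the polling limit")
```

Once the function returns True, download_files(path) can be called on the same object to fetch the results.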

submit_job()¶

Performs a POST request to the API using the assigned values in user_key and query.

If the job is initiated successfully, the initial status is stored in the job_response object. Otherwise, any HTTP error raises an exception.

Returns:: True if the submission was successful. An exception is raised otherwise.
Return type:: bool
Raises:: ValueError – When the query is empty or invalid.

SnapshotExtractionQuery¶

class factiva.analytics.snapshots.extraction.SnapshotExtractionQuery(where: str = None, includes: dict = None, include_lists: dict = None, excludes: dict = None, exclude_lists: dict = None, file_format: str = 'avro', limit: int = 0, shards: int = 25)¶

Query class used specifically for Snapshot Extraction operations.

where¶

String containing the query WHERE condition

Type:: str

includes¶

Dictionary with a fixed list of codes to include

Type:: dict

include_lists¶

Dictionary with references to Lists for inclusion

Type:: dict

excludes¶

Dictionary with a fixed list of codes to exclude

Type:: dict

exclude_lists¶

Dictionary with references to Lists for exclusion

Type:: dict

file_format¶

Chosen file format for extraction files

Type:: str

limit¶

Max number of articles to extract

Type:: int

get_payload() → dict¶

Create the basic request payload to be used within a Snapshots Extraction API request.

Returns:: Dictionary containing non-null query attributes.
Return type:: dict
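
The "non-null attributes" rule can be illustrated with a small stand-alone function. This is only a sketch: build_payload is a hypothetical helper, and the flat key layout is an assumption, so the payload actually sent to the API may be named or nested differently.

```python
def build_payload(where=None, includes=None, include_lists=None,
                  excludes=None, exclude_lists=None,
                  file_format="avro", limit=0, shards=25):
    """Collect the query attributes and drop those that are None.

    Defaults mirror the SnapshotExtractionQuery constructor signature.
    The key names are illustrative assumptions.
    """
    candidates = {
        "where": where,
        "includes": includes,
        "include_lists": include_lists,
        "excludes": excludes,
        "exclude_lists": exclude_lists,
        "file_format": file_format,
        "limit": limit,
        "shards": shards,
    }
    # Keep only attributes that were actually set (0 is a valid value).
    return {key: value for key, value in candidates.items() if value is not None}
```

Filtering on `is not None` rather than on truthiness keeps meaningful falsy values such as a limit of 0 in the result.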

SnapshotExtractionJobReponse¶

class factiva.analytics.snapshots.extraction.SnapshotExtractionJobReponse(job_id: str = None, user_key: UserKey = None)¶

Snapshot Extraction Job Response class. Contains the job status, any execution errors, and the list of files available for download.

job_id¶

Job ID returned by Factiva Analytics at submission time

Type:: str

short_id¶

Unique portion of the job_id attribute

Type:: str

job_link¶

Job unique URI

Type:: str

job_state¶

Job status value

Type:: str

errors¶

If not empty, a list of errors that occurred during the job execution

Type:: list

files¶

If the job is successful, this shows the list of files that can be downloaded with the selected content.

Type:: list
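
The attributes above can be combined into a short status line for logging. This is a minimal sketch: summarize_job is a hypothetical helper, and the stand-in object and its values (including the state name) are made up for illustration.

```python
from types import SimpleNamespace


def summarize_job(job_response):
    """Build a one-line summary from the documented response attributes."""
    errors = job_response.errors or []
    files = job_response.files or []
    parts = [f"job {job_response.short_id}: {job_response.job_state}"]
    if errors:
        parts.append(f"{len(errors)} error(s) reported")
    else:
        parts.append(f"{len(files)} file(s) ready for download")
    return "; ".join(parts)


# Stand-in object with made-up values, used only to exercise the helper.
example = SimpleNamespace(
    short_id="abc123",
    job_state="JOB_STATE_DONE",  # assumed state name
    errors=[],
    files=["part-0001.avro", "part-0002.avro"],
)
print(summarize_job(example))
```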