Snapshot Extraction#

Module that handles all Factiva Analytics Extraction requests and objects. It contains classes and tools for running extraction jobs and downloading the generated files.

SnapshotExtraction#

class factiva.analytics.snapshots.extraction.SnapshotExtraction(job_id=None, query=None, user_key=None)#

Main class to interact with the Extractions service from Factiva Analytics.

user_key#

User representation for service authentication

Type:

UserKey

query#

Query object tailored for Extraction operations

Type:

SnapshotExtractionQuery

job_response#

Object containing job status and execution details

Type:

SnapshotExtractionJobReponse

download_files(path=None)#

Downloads all files from a job and stores them in the given path.

If the path parameter is empty, files are stored in a folder with the name of the job short id.

Parameters:

path (str, Optional) – Path where the downloaded files are stored. If not provided, the files are stored in a folder named after the job short_id. If that folder does not exist, it is created in the current working directory.

Returns:

True if files were correctly downloaded, False if no files are available for download or the download failed.

Return type:

bool
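The default-folder rule described above can be pictured with a minimal sketch. It is not the library's implementation: `resolve_download_dir` and its `base_dir` argument are hypothetical, introduced only to illustrate how an empty path falls back to a folder named after the job short_id.

```python
import os


def resolve_download_dir(short_id, path=None, base_dir=None):
    """Return the folder where files would be stored, creating it if needed.

    Hypothetical helper that mirrors the documented rule only; it is not
    the code that download_files() actually runs.
    """
    if base_dir is None:
        # The documentation says the default folder lives in the
        # current working directory.
        base_dir = os.getcwd()
    # An empty or missing path falls back to a folder named after short_id.
    target = path if path else os.path.join(base_dir, short_id)
    os.makedirs(target, exist_ok=True)
    return target
```

Calling `resolve_download_dir("abc123")` would create and return an `abc123` folder under the current working directory, while an explicit `path` is used unchanged.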

get_job_response() → bool#

Performs a request to the API to obtain an updated status of a job execution.

If the job has been completed, result details are assigned to the job_response object.

Returns:

True if the GET request was successful. Raises an exception otherwise.

Return type:

bool

Raises:

ValueError – If the Job ID doesn’t exist for the user key, or the get request is invalid.
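Because get_job_response() only refreshes the status once per call, a caller that submitted a job manually usually needs a polling loop. The sketch below shows one way to write it, assuming only the documented `get_job_response()` method and the `job_response.job_state` attribute. The helper name `wait_for_job`, the state names in `done_states`, and the polling defaults are assumptions, not values confirmed by this page.

```python
import time


def wait_for_job(extraction,
                 done_states=("JOB_STATE_DONE", "JOB_STATE_FAILED"),
                 poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll get_job_response() until job_state reaches a terminal value.

    The names in done_states are assumptions; check the values that the
    service actually returns for your jobs.
    """
    state = None
    for _ in range(max_polls):
        extraction.get_job_response()  # refreshes extraction.job_response
        state = extraction.job_response.job_state
        if state in done_states:
            return state
        sleep(poll_seconds)
    raise TimeoutError(f"job still in state {state!r} after {max_polls} polls")
```

The `sleep` argument exists so the wait can be replaced in tests; in normal use the default `time.sleep` is enough.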

process_job(path=None)#

Submits a new job to be processed, waits until the job is completed, and then retrieves the job results.

Returns:

True if the extraction processing was successful. Raises an exception otherwise.

Return type:

bool
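As a usage illustration, the following sketch wraps the whole flow in one function, using only the constructor arguments and methods listed on this page. The wrapper name `run_extraction` is hypothetical, and the meaning of `path` for process_job() is not described here, so passing it through (presumably as the download folder, as in download_files()) is an assumption. The import is deferred so that the function can be defined without the package installed.

```python
def run_extraction(where, user_key=None, path=None, limit=0):
    """Submit an extraction, wait for it, and return the outcome.

    Hypothetical convenience wrapper. Calling it requires the package
    that provides factiva.analytics and valid credentials.
    """
    from factiva.analytics.snapshots.extraction import (
        SnapshotExtraction,
        SnapshotExtractionQuery,
    )

    query = SnapshotExtractionQuery(where=where, limit=limit)
    extraction = SnapshotExtraction(query=query, user_key=user_key)
    # process_job() is documented to return True on success and to
    # raise an exception otherwise.
    ok = extraction.process_job(path=path)
    return ok, extraction.job_response
```

A call might look like `run_extraction("language_code = 'en'", limit=100)`, where the field name in the condition is illustrative only.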

submit_job()#

Performs a POST request to the API using the assigned values in user_key and query.

If the job is initiated successfully, the initial status is stored in the job_response object. Otherwise, any HTTP error raises an exception.

Returns:

True if the submission was successful. Raises an exception otherwise.

Return type:

bool

Raises:

ValueError – When the query is empty or invalid.

SnapshotExtractionQuery#

class factiva.analytics.snapshots.extraction.SnapshotExtractionQuery(where: str | None = None, includes: dict | None = None, include_lists: dict | None = None, excludes: dict | None = None, exclude_lists: dict | None = None, file_format: str = 'avro', limit: int = 0)#

Query class used specifically for Snapshot Extraction operations.

where#

String with the WHERE condition that selects the articles to extract

Type:

str

includes#

Dictionary with a fixed list of codes to include

Type:

dict

include_lists#

Dictionary with references to Lists for inclusion

Type:

dict

excludes#

Dictionary with a fixed list of codes to exclude

Type:

dict

exclude_lists#

Dictionary with references to Lists for exclusion

Type:

dict

file_format#

Chosen file format for extraction files

Type:

str

limit#

Max number of articles to extract

Type:

int

get_payload() → dict#

Create the basic request payload to be used within a Snapshots Extraction API request.

Returns:

Dictionary containing non-null query attributes.

Return type:

dict
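The idea of a payload that keeps only non-null attributes can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `build_payload` is a hypothetical stand-in, and its key names simply reuse the constructor arguments, whereas the real API payload may nest or rename them.

```python
def build_payload(where=None, includes=None, include_lists=None,
                  excludes=None, exclude_lists=None,
                  file_format="avro", limit=0):
    """Collect the query attributes that are set into a plain dictionary.

    Hypothetical stand-in for get_payload(); key names are assumptions.
    """
    candidates = {
        "where": where,
        "includes": includes,
        "include_lists": include_lists,
        "excludes": excludes,
        "exclude_lists": exclude_lists,
        "file_format": file_format,
        "limit": limit,
    }
    # Drop attributes that were never set (None); everything else is kept,
    # including the defaults for file_format and limit.
    return {key: value for key, value in candidates.items() if value is not None}
```

With only `where` and `includes` supplied, the resulting dictionary holds those two entries plus the `file_format` and `limit` defaults, and omits the unset include and exclude entries.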

SnapshotExtractionJobReponse#

class factiva.analytics.snapshots.extraction.SnapshotExtractionJobReponse(job_id: str | None = None, user_key: UserKey | None = None)#

Snapshot Extraction Job Response class. Contains the job status and execution details, including the list of files generated by the job.

job_id#

Job ID returned by Factiva Analytics at submission time

Type:

str

short_id#

Unique portion of the job_id attribute

Type:

str

Job unique URI

Type:

str

job_state#

Job status value

Type:

str

errors#

List of errors raised during the job execution; empty if there were none

Type:

list

files#

If the job is successful, this shows the list of files that can be downloaded with the selected content.

Type:

list