airflow_indexima.operators.indexima

Indexima operators module definition.

IndeximaHookBasedOperator

IndeximaHookBasedOperator(self, task_id:str, indexima_conn_id:str, connection_decorator:Union[Callable[[airflow.models.Connection], airflow.models.Connection], NoneType]=None, dry_run:Union[bool, NoneType]=False, auth:Union[str, NoneType]=None, kerberos_service_name:Union[str, NoneType]=None, timeout_seconds:Union[int, datetime.timedelta, NoneType]=None, socket_keepalive:Union[bool, NoneType]=None, *args, **kwargs)
Our base class for indexima operator.

This class act as a wrapper on IndeximaHook.

ui_color

str(object='') -> str str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or errors is specified, then the object must expose a data buffer that will be decoded using the given encoding and error handler. Otherwise, returns the result of object.str() (if defined) or repr(object). encoding defaults to sys.getdefaultencoding(). errors defaults to 'strict'.

get_hook

IndeximaHookBasedOperator.get_hook(self) -> airflow_indexima.hooks.indexima.IndeximaHook
Return a configured IndeximaHook instance.

IndeximaQueryRunnerOperator

IndeximaQueryRunnerOperator(self, task_id:str, sql_query:str, indexima_conn_id:str, connection_decorator:Union[Callable[[airflow.models.Connection], airflow.models.Connection], NoneType]=None, dry_run:Union[bool, NoneType]=False, auth:Union[str, NoneType]=None, kerberos_service_name:Union[str, NoneType]=None, timeout_seconds:Union[int, datetime.timedelta, NoneType]=None, socket_keepalive:Union[bool, NoneType]=None, *args, **kwargs)
A simple query executor.

template_fields

tuple() -> empty tuple tuple(iterable) -> tuple initialized from iterable's items

If the argument is a tuple, the return value is the same object.

execute

IndeximaQueryRunnerOperator.execute(self, context)
Execute sql query.

Parameters

  • context: dag context

IndeximaLoadDataOperator

IndeximaLoadDataOperator(self, task_id:str, indexima_conn_id:str, target_table:str, load_path_uri:str, truncate:bool=False, truncate_sql:Union[str, NoneType]=None, connection_decorator:Union[Callable[[airflow.models.Connection], airflow.models.Connection], NoneType]=None, source_select_query:Union[str, NoneType]=None, format_query:Union[str, NoneType]=None, prefix_query:Union[str, NoneType]=None, skip_lines:Union[int, NoneType]=None, no_check:Union[bool, NoneType]=False, limit:Union[int, NoneType]=None, locale:Union[str, NoneType]=None, dry_run:Union[bool, NoneType]=False, auth:Union[str, NoneType]=None, kerberos_service_name:Union[str, NoneType]=None, timeout_seconds:Union[int, datetime.timedelta, NoneType]=None, socket_keepalive:Union[bool, NoneType]=None, pause_delay_in_seconds_between_query:Union[int, NoneType]=None, *args, **kwargs)
Indexima load data operator.

Operations:

1. truncate target_table (false per default)
2. load source_select_query into target_table using redshift_user_name credential
4. commit/rollback target_table

All fields ('target_table', 'load_path_uri', 'source_select_query', 'truncate_sql', 'format_query', 'prefix_query', 'skip_lines', 'no_check', 'limit', 'locale', 'pause_delay_in_seconds_between_query' ) support airflow macro.

Syntax (see https://indexima.com/support/doc/v.1.7/Load_Data/Load_Data_Inpath.html)

LOAD DATA INPATH 'path_of_the_data_source'
INTO TABLE my_data_space
[FORMAT 'separator' / ORC / PARQUET / JSON];
[PREFIX 'value1 \t value2 \t ... \t']
[QUERY "my_SQL_Query"]
[SKIP lines]
[NOCHECK]
[LIMIT n_lines]
[LOCALE 'FR']

template_fields

tuple() -> empty tuple tuple(iterable) -> tuple initialized from iterable's items

If the argument is a tuple, the return value is the same object.

generate_load_data_query

IndeximaLoadDataOperator.generate_load_data_query(self) -> str
Generate 'load data' sql query.

Returns

(str): load data sql query

execute

IndeximaLoadDataOperator.execute(self, context)
Process executor.