Using NumPy (Numerical Python) and Handling NULLs within a Vector User Defined Function (UDF)

This video demonstrates how to use Python's NumPy functionality and how to handle NULL values within a Vector User Defined Function (UDF).

About this course

At the end of this course you will:

Understand how to implement a Vector User Defined Function (UDF) using Python's NumPy functionality, and know the best practice for handling NULL values within a UDF.

  • NumPy's key strength is processing vectors of numeric data. Other data types, such as strings and dates, are supported in NumPy UDFs, but many of them revert to the underlying Python data type rather than a NumPy array. To realize the benefit of NumPy array processing, access the data through NumPy APIs; converting to Python objects negates any performance benefit and may even be slower. NumPy UDFs receive and return vectors rather than one tuple at a time.
  • By default, UDFs follow conventional SQL rules when NULL arguments are encountered: if any UDF argument is NULL, the UDF is not invoked and its result is assumed to be NULL. If you need to write a UDF that either returns NULLs or processes NULL argument values, you must add nullskip='ignore' to the UDF definition. When this is set, the UDF is expected to process NULL arguments and can also return NULL values. NULLs are indicated by null in JavaScript and None in Python. Supporting NULLs adds overhead to UDF processing, so enable it only when necessary.
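The two points above can be sketched in plain Python and NumPy. This is only an illustration, not the Actian UDF definition syntax: the function names are hypothetical, the temperature conversion is an arbitrary example, and the NULL-aware function assumes that NULLs arrive as None entries in the incoming vector, which the text above implies but does not spell out for NumPy UDFs.

```python
import numpy as np


def fahrenheit_vectorized(celsius: np.ndarray) -> np.ndarray:
    """Preferred: one NumPy expression processes the whole vector."""
    return celsius * 9.0 / 5.0 + 32.0


def fahrenheit_per_value(celsius: np.ndarray) -> np.ndarray:
    """Anti-pattern: tolist() turns every element into a Python float,
    so the arithmetic runs in the interpreter one value at a time."""
    return np.array([c * 9.0 / 5.0 + 32.0 for c in celsius.tolist()])


def fahrenheit_with_nulls(celsius) -> np.ndarray:
    """NULL-aware variant (assumption: NULLs are None in the input vector)."""
    values = np.asarray(celsius, dtype=object)
    # Boolean mask marking the NULL positions.
    is_null = np.array([v is None for v in values], dtype=bool)
    # Object array so that None can be stored alongside numbers.
    result = np.empty(values.shape, dtype=object)
    result[is_null] = None
    # Only the non-NULL values take the vectorized numeric path.
    non_null = values[~is_null].astype(np.float64)
    result[~is_null] = non_null * 9.0 / 5.0 + 32.0
    return result
```

The NULL-aware variant has to build a mask and fall back to an object array, which is one way to see why NULL support adds overhead and is best enabled only when the UDF really needs it.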
Course Style:
The content is presented in a how-to style so that you can follow along in your own database environment.
Audience:
For Data Scientists, Developers, and System or Database Administrators who are responsible for maintaining corporate databases and data warehouses.
Prerequisites:
  • Database Administrator role/privilege 
  • Data Scientist / Developer with a sound understanding of how to best use the NumPy functionality 
  • Minimum database software versions: Actian X 11.2 and Actian Vector 6.2
Resource Links:
Software Download: https://esd.actian.com/
Actian Community: https://communities.actian.com
Documentation: https://docs.actian.com/

Curriculum 11 min

  • NumPy and NULL handling 10 min
  • Feedback
  • Take Course Survey 1 min
