Posted On: Sep 11, 2015

You can now create and run scalar user-defined functions (UDFs) in Amazon Redshift. With scalar UDFs, you can perform analytics that were previously impossible or too complex for plain SQL.

Using PostgreSQL syntax, you can create custom scalar functions in Python 2.7 and execute them in parallel across your clusters. Once you define a scalar UDF, you can use it in any SQL statement, just as you would use our built-in scalar functions. To learn more about creating and using scalar UDFs, see the scalar UDF documentation.
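
As a rough illustration of what defining a scalar UDF can look like, the sketch below keeps the function body as an ordinary Python function (so it can be checked locally) next to the `CREATE FUNCTION` statement that would register the same logic. The function name `f_greater`, the `sales` table, and its columns are illustrative assumptions, not part of this announcement; check the scalar UDF documentation for the exact syntax supported by your cluster.

```python
# Body of a hypothetical scalar UDF, written as a plain Python function
# so the logic can be exercised locally before it is registered.
def f_greater(a, b):
    """Return the larger of two numbers."""
    if a > b:
        return a
    return b


# The same logic wrapped in a CREATE FUNCTION statement. The Python body
# sits between the $$ delimiters; the name and argument types are assumptions.
CREATE_F_GREATER = """
CREATE FUNCTION f_greater (a float, b float)
  RETURNS float
STABLE
AS $$
  if a > b:
    return a
  return b
$$ LANGUAGE plpythonu;
"""

# Hypothetical query: once created, the UDF is called like a built-in function.
EXAMPLE_QUERY = "SELECT f_greater(commission, pricepaid * 0.20) FROM sales;"
```

Keeping the body testable as a regular function is a design choice for this sketch only; the cluster runs the code between the `$$` delimiters.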

You can also take advantage of thousands of functions available through Python libraries. Amazon Redshift UDFs come integrated with most of the Python Standard Library as well as the NumPy, SciPy, pandas, dateutil, six, wsgiref, and pytz libraries. Simply import the relevant Python libraries into your scalar UDFs and use them in your SQL.
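
To make the library import concrete, here is a minimal sketch of a UDF body that uses the standard library `math` module to compute a great-circle distance with the haversine formula. The function name `f_distance_km`, its arguments, and the Earth radius constant are assumptions chosen for illustration; the point is only that the `import` sits inside the function body.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant for this sketch


def f_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Hypothetical registration: the import goes inside the $$ body,
# ahead of the code that uses it.
CREATE_F_DISTANCE_KM = """
CREATE FUNCTION f_distance_km (lat1 float, lon1 float, lat2 float, lon2 float)
  RETURNS float
IMMUTABLE
AS $$
  import math
  phi1 = math.radians(lat1)
  phi2 = math.radians(lat2)
  dphi = math.radians(lat2 - lat1)
  dlmb = math.radians(lon2 - lon1)
  a = (math.sin(dphi / 2) ** 2
       + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
  return 2 * 6371.0 * math.asin(math.sqrt(min(1.0, a)))
$$ LANGUAGE plpythonu;
"""
```

The same pattern would apply to the bundled analytic libraries, with the import statement replaced accordingly.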

To deliver the highest level of security, we run UDFs inside a restricted container that cannot access the network or make system calls. For fast performance, we run UDFs in parallel on each node of your cluster. For further details, see our blog post.

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. For less than $1,000/TB/YR, you can focus on your analytics, while Amazon Redshift manages the infrastructure for you. Get started with a free 2-month trial.