Artificial Intelligence

Jeremy Curuksu

Author: Jeremy Curuksu

Reinforcement learning from human feedback (RLHF) vs. AI feedback (RLAIF)

Fine-tune large language models with reinforcement learning from human or AI feedback

In this post, we introduce a state-of-the-art method to fine-tune LLMs by reinforcement learning, reviewed the pros and cons of RLHF vs. RLAIF vs. DPO, and saw how to scale LLM fine-tuning efforts with RLAIF. We also see how to implement an end-to-end RLAIF pipeline on SageMaker using the Hugging Face Transformer and TRL libraries, and using either off-the-shelf toxicity reward models to align responses during PPO or by directly prompting an LLM to generate quantitative reward feedback during PPO.

Automating financial decision making with deep reinforcement learning

Machine learning (ML) is routinely used in every sector to make predictions. But beyond simple predictions, making decisions is more complicated because non-optimal short-term decisions are sometimes preferred or even necessary to enable long-term, strategic goals. Optimizing policies to make sequential decisions toward a long-term objective can be learned using a family of ML models […]

Developing a business strategy by combining machine learning with sensitivity analysis

Machine learning (ML) is routinely used by countless businesses to assist with decision making. In most cases, however, the predictions and business decisions made by ML systems still require the intuition of human users to make judgment calls. In this post, I show how to combine ML with sensitivity analysis to develop a data-driven business […]

Forecasting financial time series with dynamic deep learning on AWS

In this post, I will show you how to develop an original RNN (Recurrent Neural Network) deep learning algorithm to forecast time series based on the past trends of multiple factors, taking advantage of Amazon SageMaker (using Bring-Your-Own-Algorithm). Amazon SageMaker is a fully-managed machine learning platform that enables data scientists and developers to quickly and easily build and train machine learning models into production applications, at scale. It enables you to use both built-in algorithms, built-in frameworks, and also import custom code via Docker containers.