Front-End Web & Mobile
Amplify CLI announces new GraphQL transform feature for orchestrating multiple AI/ML use cases
The Amplify Framework is an open source project for building cloud-enabled mobile and web applications.
The launch of the Predictions category in the Amplify Framework a few months ago enabled developers to easily add AI/ML use cases to their web applications. Use cases such as translating text from one language to another, generating speech from text, and others can be achieved with the Predictions category in a few lines of code. No machine learning experience is required. As exciting use cases are built with this feature, we see a pattern where multiple actions are combined to achieve a more powerful scenario. For example, you may want to translate text from one language to another and then have the translated text spoken in the target language. This can be achieved with the Predictions library today by calling the individual actions separately.
Today, we are happy to share that Amplify now makes it simple to orchestrate multiple AI/ML chained actions by using a @predictions directive on the query field of your GraphQL schema. The Amplify CLI sets up the backend, policies, and configuration needed for all actions in the predictions directive without you needing to configure each one. You can then use the Amplify API category library to invoke a single GraphQL query operation to get the result of chained inference calls. This simplifies your code and reduces the number of steps needed to achieve orchestration of multiple chained actions both in your frontend and backend.
In this blog, we use the @predictions directive and the Amplify API library to build a React app that first identifies text in an image (English), then translates that text to another language (Spanish), and finally converts the translated text to speech (Spanish), all with one simple directive on the query field of your GraphQL schema. The directive sets up a GraphQL API endpoint with HTTP data sources for the Amazon AI/ML services corresponding to the individual actions: Amazon Rekognition for identifying text, Amazon Translate for translating text, and Amazon Polly for generating speech. In addition, it sets up the IAM policies for each service and the AppSync VTL resolver functions.
The sequences of actions supported today are:
- IdentifyText –> TranslateText –> ConvertTextToSpeech
- IdentifyLabels –> TranslateText –> ConvertTextToSpeech
- TranslateText –> ConvertTextToSpeech
In addition, these actions can be called individually.
Here is how a sample flow for the actions IdentifyText followed by TranslateText looks at a high level:
The application UI we are building looks like the following:
The upload action in both cases stores the images in an S3 bucket provisioned by the Amplify CLI.
Prerequisites
Install Node.js and npm if they are not already installed on your machine.
Note: At the time of writing this blog, the minimum required versions are Node.js >= 8.x and npm >= 5.x.
Install and configure the Amplify CLI.
The configure step will guide you through steps for creating a new IAM user. Select all default options. If you already have the CLI configured, you do not need to run the configure command again.
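If you do not have the CLI installed yet, you can install and configure it with the following commands:

```bash
npm install -g @aws-amplify/cli
amplify configure
```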
Initialize the project
For this blog, let’s say you have a React application. If you do not have one, you can create one using Create React App.
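For example, using a placeholder app name of predictions-sample:

```bash
# Create a new React app; the name "predictions-sample" is just a placeholder.
npx create-react-app predictions-sample
cd predictions-sample
```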
From the root of your project folder, run the amplify init command and accept the defaults where applicable.
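The CLI prompts for a project name, environment, default editor, and framework settings; for a Create React App project the defaults should work:

```bash
amplify init
```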
Using the @predictions directive in your GraphQL schema
The @predictions directive is added to a query field in your GraphQL schema. The Amplify CLI provisions the backend resources for the actions mentioned in the directive. In our sample, we want to identify text in an image, translate it to another language, and then convert the translated text to speech.
First, let us add an API to our backend using the amplify add api command. This creates a GraphQL endpoint that communicates with the HTTP endpoints for the services corresponding to each action mentioned in our directive.
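When prompted to select a service type, choose GraphQL; the remaining prompts can generally be left at their defaults.

```bash
amplify add api
```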
Edit your schema.graphql file to add the predictions directive as shown below.
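Here is a sketch of what the query field could look like. The field name speakTranslatedImageText is a placeholder; the directive lists the three actions in the order we want them to run:

```graphql
type Query {
  speakTranslatedImageText: String
    @predictions(actions: [identifyText, translateText, convertTextToSpeech])
}
```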
As you can see, the actions are added in the directive, and they are executed in the order in which you list them.
Once you have updated your schema file, run the gql-compile command to make sure it is correct:
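```bash
amplify api gql-compile
```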
You can also invoke single actions using the @predictions directive. For example, if you want only text translation, you can add the following field to the Query type in your GraphQL schema.
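For instance, a field like the following (the name translateThis is just a placeholder) added to the Query type:

```graphql
translateThis: String @predictions(actions: [translateText])
```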
Add Storage
We add storage to hold the images from which we identify text. Run the following command from the terminal:
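When prompted, choose the content option (images, audio, video) and grant access to users as needed; the exact prompt wording may differ between CLI versions.

```bash
amplify add storage
```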
Push your changes to the cloud
The push command will provision the backend in the cloud.
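```bash
amplify push
```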
Making query calls from your client application
First, we install the dependencies by running the following commands from the root of your application folder.
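At a minimum, the Amplify JavaScript library is required (the original post may install additional UI packages as well):

```bash
npm install aws-amplify
```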
Replace the code in your src/App.js:
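The original component is longer (it also covers the label use case); below is a minimal sketch of the first flow only. The query name speakTranslatedImageText matches the schema sketch above, and the input field names (identifyText, translateText, convertTextToSpeech) and the pre-signed audio URL returned by the resolver are assumptions about what the @predictions transform generates, so treat this as illustrative rather than the post's exact code.

```js
// src/App.js — a minimal sketch, not the post's exact code. The query name
// matches the schema above; the input field names and the returned pre-signed
// audio URL are assumptions about what the @predictions transform generates.
import React, { useState } from 'react';
import Amplify, { API, Storage, graphqlOperation } from 'aws-amplify';
import awsconfig from './aws-exports';

Amplify.configure(awsconfig);

// Query for the field defined with the @predictions directive in schema.graphql.
const speakTranslatedImageText = /* GraphQL */ `
  query SpeakTranslatedImageText($input: SpeakTranslatedImageTextInput!) {
    speakTranslatedImageText(input: $input)
  }
`;

function App() {
  const [audioUrl, setAudioUrl] = useState(null);

  async function onFileSelected(event) {
    const file = event.target.files[0];
    if (!file) return;

    // Upload the image to the S3 bucket provisioned by `amplify add storage`.
    const { key } = await Storage.put(file.name, file, { contentType: file.type });

    // A single GraphQL call runs identifyText -> translateText -> convertTextToSpeech.
    const result = await API.graphql(
      graphqlOperation(speakTranslatedImageText, {
        input: {
          identifyText: { key },                                         // image uploaded above
          translateText: { sourceLanguage: 'en', targetLanguage: 'es' }, // English -> Spanish
          convertTextToSpeech: { voiceID: 'Mia' },                       // Spanish voice
        },
      })
    );

    // The resolver chain is assumed to return a URL to the synthesized speech.
    setAudioUrl(result.data.speakTranslatedImageText);
  }

  return (
    <div className="App">
      <h2>Speak translated image text</h2>
      <input type="file" accept="image/*" onChange={onFileSelected} />
      {audioUrl && <audio controls src={audioUrl} />}
    </div>
  );
}

export default App;
```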
Let’s take a look at the important pieces here:
As you can see, we are calling a GraphQL query using the Amplify API library. We pass an input object to the API call, which contains:
- The translate text action, which specifies the source and target languages, in this case English and Spanish respectively.
- The identify text action, which passes in the S3 key of the image in which to identify text.
- The convert to speech action, which contains the voice ID of the speaker, “Mia”. The list of supported voices in different languages can be found here.
Save the file. Start the app from the terminal by running the command:
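```bash
npm start
```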
This will open up the application:
Next, we choose a picture that has text in it and upload it for the first use case:
The image is uploaded to the S3 bucket that was provisioned by the CLI. Then, a call to the HTTP endpoint for Amazon Rekognition is made through the GraphQL endpoint to identify the text in the image. Next, a similar HTTP call is made to Amazon Translate to translate the text to Spanish. Finally, a call is made to Amazon Polly to convert the translated text to speech.
When you click on the play button below the image, it plays the audio for translated text in Spanish saying “bienvenida”.
Next, let’s test the use case of speaking labels in an image.
Upload the following image in the “Speaks Label” section:
When you click play, it identifies the following labels in the image and plays the audio in the voice of the selected speaker: “outdoor”, “nature”, “water”, “boat”, “vehicle”, “transportation”, “rowboats”, “mountains”, “scenery”, and “landscape”.
Conclusion
We were able to use the @predictions directive on the query fields of a GraphQL schema and invoke multiple chained AI/ML actions with a single API query call. The Amplify CLI provisioned the GraphQL API and set up the HTTP endpoints as data sources, the IAM policies to interact with those endpoints, and the AppSync VTL resolvers needed for the individual actions with their respective services.
Feedback
We hope you like these new features! Let us know how we are doing, and submit any feedback in the Amplify Framework GitHub repository. You can read more about this feature on the Amplify Framework website.