In this post, we demonstrate how to build a simple web-based search application using the recently announced Amazon OpenSearch Serverless, a serverless option for Amazon OpenSearch Service that makes it easy to run petabyte-scale search and analytics workloads without having to think about clusters. The benefit of using OpenSearch Serverless as a backend for your search application is that it automatically provisions and scales the underlying resources based on the search traffic demands, so you don't have to worry about infrastructure management. You can simply focus on building your search application and analyzing the results. OpenSearch Serverless is powered by the open-source OpenSearch project, which consists of a search engine and OpenSearch Dashboards, a visualization tool to analyze your search results.
Solution overview
There are many ways to build a search application. In our example, we create a simple JavaScript front end and call Amazon API Gateway, which triggers an AWS Lambda function upon receiving user queries. As shown in the following diagram, API Gateway acts as a broker between the front end and the OpenSearch Serverless collection. When the user queries the front-end webpage, API Gateway passes requests to the Python Lambda function, which runs the queries on the OpenSearch Serverless collection and returns the search results.
To get started with the search application, you must first upload the relevant dataset, a movie catalog in this case, to the OpenSearch Serverless collection and index it to make it searchable.
Create a collection in OpenSearch Serverless
A collection in OpenSearch Serverless is a logical grouping of one or more indexes that represent a workload. You can create a collection using the AWS Management Console or AWS Software Development Kit (AWS SDK). Follow the steps in Preview: Amazon OpenSearch Serverless – Run Search and Analytics Workloads without Managing Clusters to create and configure a collection in OpenSearch Serverless.
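For the SDK route mentioned above, the following is a minimal sketch using the boto3 `opensearchserverless` client. The collection name `movie-search`, the choice of an AWS owned key, and the helper names are assumptions for illustration, not steps taken from the linked post. A collection needs a matching encryption policy before it can be created, and network and data access policies before you can reach it.

```python
import json


def encryption_policy(collection_name):
    """Build an encryption policy document covering one collection (AWS owned key)."""
    return {
        "Rules": [
            {
                "ResourceType": "collection",
                "Resource": [f"collection/{collection_name}"],
            }
        ],
        "AWSOwnedKey": True,
    }


def create_search_collection(collection_name, region):
    """Create the encryption policy, then the collection.

    Requires boto3 and AWS credentials with OpenSearch Serverless permissions.
    """
    import boto3  # imported lazily so the policy helper works without boto3

    client = boto3.client("opensearchserverless", region_name=region)
    client.create_security_policy(
        name=f"{collection_name}-encryption",  # hypothetical policy name
        type="encryption",
        policy=json.dumps(encryption_policy(collection_name)),
    )
    return client.create_collection(name=collection_name, type="SEARCH")
```

Calling `create_search_collection("movie-search", "us-west-2")` would start collection creation; the collection is usable only after its status becomes active.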
Create an index and ingest data
After your collection is created and active, you can upload the movie data to an index in this collection. Indexes hold documents, and each document in this example represents a movie record. Documents are comparable to rows in a database table. Each document (the movie record) consists of 10 fields that are typically searched for in a movie catalog, like the director, actor, release date, genre, title, or plot of the movie. The following is a sample movie JSON document:
{
  "directors": ["David Yates"],
  "release_date": "2011-07-07T00:00:00Z",
  "rating": 8.1,
  "genres": ["Adventure", "Family", "Fantasy", "Mystery"],
  "plot": "Harry, Ron and Hermione search for Voldemort's remaining Horcruxes in their effort to destroy the Dark Lord.",
  "title": "Harry Potter and the Deathly Hallows: Part 2",
  "rank": 131,
  "running_time_secs": 7800,
  "actors": ["Daniel Radcliffe", "Emma Watson", "Rupert Grint"],
  "year": 2011
}
For the search catalog, you can upload the sample-movies.bulk dataset sourced from the Internet Movie Database (IMDb). OpenSearch Serverless offers the same ingestion pipeline and clients to ingest the data as OpenSearch Service, such as Fluentd, Logstash, and Postman. Alternatively, you can use the OpenSearch Dashboards Dev Tools to ingest and search the data without configuring any additional pipelines. To do so, log in to OpenSearch Dashboards using your SAML credentials and choose Dev Tools.
To create a new index, use the PUT command followed by the index name:
PUT movies-index
A confirmation message is displayed upon successful creation of your index.
After the index is created, you can ingest documents into the index. OpenSearch provides the option to ingest multiple documents in one request using the _bulk request. Enter POST /_bulk in the left pane as shown in the following screenshot, then copy and paste the contents of the sample-movies.bulk file you downloaded earlier.
You have successfully created the movies index and uploaded 1,500 records into the catalog! Now let's integrate the movie catalog with your search application.
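To make the shape of a _bulk request concrete, here is a small sketch that renders your own documents in the same newline-delimited JSON layout: an action line followed by the document itself, with a trailing newline at the end of the body. The helper name `to_bulk_body` is hypothetical, and the sketch leaves out document IDs.

```python
import json


def to_bulk_body(index_name, documents):
    """Render documents as newline-delimited JSON for the _bulk API."""
    lines = []
    for doc in documents:
        # Action line: index the next document into the given index.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc))
    # The _bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    movies = [{"title": "Harry Potter and the Deathly Hallows: Part 2", "year": 2011}]
    print(to_bulk_body("movies-index", movies), end="")
```

You can paste the printed text after POST /_bulk in Dev Tools.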
Integrate the Lambda function with an OpenSearch Serverless endpoint
In this step, you create a Lambda function that queries the movie catalog in OpenSearch Serverless and returns the result. For more information, see our tutorial on creating a Lambda function for connecting to and querying an OpenSearch Service domain. You can reuse the same code by replacing the parameters to align to OpenSearch Serverless's requirements. Replace <my-region> with your corresponding Region (for example, us-west-2), use aoss instead of es for service, replace <hostname> with the OpenSearch Serverless collection endpoint, and replace <index-name> with your index (in this case, movies-index).
The following is a snippet of the Lambda code. You can find the complete code in the tutorial.
import boto3
import json
import requests
from requests_aws4auth import AWS4Auth

region = '<my-region>'
service = 'aoss'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

host = '<hostname>'  # The OpenSearch collection endpoint
index = '<index-name>'
url = host + '/' + index + '/_search'

# Lambda execution starts here
def lambda_handler(event, context):
    ...  # Handler body omitted in this snippet; see the tutorial for the complete code
This Lambda function returns a list of movies based on a search string (such as movie title, director, or actor) provided by the user.
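The snippet above stops at the handler signature. As a rough sketch of what the handler's control flow could look like, the following separates request parsing and response shaping from the actual HTTP call. This is not the tutorial's code: `handle_search`, `build_response`, and the injected `run_search` callable are hypothetical names, and the response shape assumes an API Gateway Lambda proxy integration.

```python
import json


def build_response(status_code, body):
    """Shape a result the way a Lambda proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Access-Control-Allow-Origin": "*"},
        "isBase64Encoded": False,
        # The body must be a string, so serialize anything else as JSON.
        "body": body if isinstance(body, str) else json.dumps(body),
    }


def handle_search(event, run_search):
    """Extract the `q` parameter, run the search, and wrap the result.

    `run_search` is a hypothetical callable that takes the query text and
    returns the raw response text from the collection.
    """
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    text = params.get("q", "").strip()
    if not text:
        return build_response(400, {"message": "Missing query parameter 'q'"})
    return build_response(200, run_search(text))
```

In the Lambda function, `run_search` would wrap the signed `requests` call against `url` with `awsauth`; keeping it injectable lets you exercise the handler logic locally without AWS credentials.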
Next, you need to configure the permissions in OpenSearch Serverless's data access policy to let the Lambda function access the collection.
- On the Lambda console, navigate to your function.
- On the Configuration tab, in the Permissions section, under Execution role, copy the value for Role name.
- Add this role name as one of the principals of your movie-search collection's data access policy.
Principals can be AWS Identity and Access Management (IAM) users, role ARNs, or SAML identities. These principals must be within the current AWS account.
After you add the role name as a principal, you can see the role ARN updated in your rule, as shown in the following screenshot.
Now you can grant collection and index permissions to this principal.
For more details about data access policies, refer to Data access control for Amazon OpenSearch Serverless. Skipping this step or not running it correctly will result in permission errors, and your Lambda code won't be able to query the movie catalog.
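For reference, a data access policy is a JSON document that pairs rules with principals. The sketch below builds a read-only policy for the Lambda execution role. The permission set shown is an assumption about what a search-only function needs, and the account ID and role name in the usage example are placeholders.

```python
import json


def data_access_policy(collection_name, role_arn):
    """Build a read-only data access policy for one principal."""
    rules = [
        {
            # Collection-level permission to list the collection's items.
            "ResourceType": "collection",
            "Resource": [f"collection/{collection_name}"],
            "Permission": ["aoss:DescribeCollectionItems"],
        },
        {
            # Index-level permissions to inspect and search every index.
            "ResourceType": "index",
            "Resource": [f"index/{collection_name}/*"],
            "Permission": ["aoss:DescribeIndex", "aoss:ReadDocument"],
        },
    ]
    return [{"Rules": rules, "Principal": [role_arn]}]


if __name__ == "__main__":
    # Placeholder ARN; use the execution role you copied from the Lambda console.
    arn = "arn:aws:iam::123456789012:role/movie-search-lambda-role"
    print(json.dumps(data_access_policy("movie-search", arn), indent=2))
```

The printed JSON can be pasted into the policy's JSON editor on the console or passed to the SDK when creating a data access policy.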
Configure API Gateway
API Gateway acts as a front door for applications to access the code running on Lambda. To create, configure, and deploy the API for the GET method, refer to the steps in the tutorial. For API Gateway to pass the requests to the Lambda function, configure it as a trigger to invoke the Lambda function.
The next step is to integrate it with the front end.
Test the web application
To build the front-end UI, you can download the following sample JavaScript web service. Open the scripts/search.js file and update the apigatewayendpoint variable to point to your API Gateway endpoint:
var apigatewayendpoint = 'https://kxxxxxxzzz.execute-api.us-west-2.amazonaws.com/opensearch-api-test/';
// Update this variable to point to your API Gateway endpoint.
You can access the front-end application by opening index.html in your browser. When the user runs a query on the front-end application, it calls API Gateway and Lambda to serve up the content hosted in the OpenSearch Serverless collection.
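If you want to check the API without the browser, a short script can call the endpoint directly. This is a sketch that assumes the deployed GET method reads the search text from a `q` query parameter (as the Lambda query in this post does) and returns JSON; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(api_endpoint, text):
    """Append the search text as the `q` query parameter."""
    return api_endpoint + "?" + urlencode({"q": text})


def fetch_results(api_endpoint, text, timeout=10):
    """Call the API Gateway endpoint and decode the JSON response."""
    with urlopen(search_url(api_endpoint, text), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_results(apigatewayendpoint, "Harry Potter")` with your own endpoint URL should return the same search response the front end receives.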
When you search the movie catalog, the Lambda function runs the following query:
# Put the user query into the query DSL for more accurate search results.
# The query text is matched against the title, plot, and actors fields.
query = {
    "size": 25,
    "query": {
        "multi_match": {
            "query": event['queryStringParameters']['q'],
            "fields": ["title", "plot", "actors"]
        }
    }
}
The query returns documents based on a provided query string. Let's look at the parameters used in the query:
- size – The size parameter is the maximum number of documents to return. In this case, a maximum of 25 results is returned.
- multi_match – You use a match query when matching larger pieces of text, especially when you're using OpenSearch's relevance to sort your results. With a multi_match query, you can query across multiple fields specified in the query.
- fields – The list of fields you are querying.
In a search for "Harry Potter," the document with the matching term both in the title and plot fields appears higher than other documents with the matching term only in the title field.
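The query body can also be built by a small helper, which makes it easier to experiment with the field list. The sketch below mirrors the query shown earlier; the helper name `build_query` is hypothetical. OpenSearch lets you weight a field by appending `^` and a boost factor to its name, and the boost value in the usage note is purely illustrative.

```python
def build_query(text, fields=("title", "plot", "actors"), size=25):
    """Build a multi_match query body for the given search text."""
    return {
        "size": size,  # maximum number of documents to return
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),  # fields searched for the text
            }
        },
    }
```

For example, `build_query("Harry Potter", fields=("title^2", "plot", "actors"))` would weight title matches more heavily than plot or actor matches.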
Congratulations! You have configured and deployed a search application fronted by API Gateway, running Lambda functions for the queries served by OpenSearch Serverless.
Clean up
To avoid unwanted charges, delete the OpenSearch Serverless collection, Lambda function, and API Gateway API that you created.
Conclusion
In this post, you learned how to build a simple search application using OpenSearch Serverless. With OpenSearch Serverless, you don't have to worry about managing the underlying infrastructure. OpenSearch Serverless supports the same ingestion and query APIs as the OpenSearch Project. You can quickly get started by ingesting the data into your OpenSearch Serverless collection, and then perform searches on the data using your web interface.
In subsequent posts, we dive deeper into many other search queries and features that you can use to make your search application even more effective.
We would love to hear how you are building your search applications today. If you're just getting started with OpenSearch Serverless, we recommend getting hands-on with the Getting started with Amazon OpenSearch Serverless workshop.
About the authors
Aish Gunasekar is a Specialist Solutions Architect with a focus on Amazon OpenSearch Service. Her passion at AWS is to help customers design highly scalable architectures and help them in their cloud adoption journey. Outside of work, she enjoys hiking and baking.
Pavani Baddepudi is a senior product manager working in search services at AWS. Her interests include distributed systems, networking, and security.
https://aws.amazon.com/blogs/big-data/build-a-search-application-with-amazon-opensearch-serverless/