
Elastic Inference Service

Availability: Serverless and Stack deployments; unavailable for self-managed deployments.

The Elastic Inference Service (EIS) enables you to use AI-powered search as a service without deploying a model in your cluster. With EIS, you don't need to add, configure, and scale machine learning nodes to provide the infrastructure and resources that machine learning inference requires. Instead, you can use machine learning models for ingest, search, and chat independently of your Elasticsearch infrastructure.

  • Your Elastic deployment or project comes with a default Elastic Managed LLM connector. This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground.

  • You can use ELSER to perform semantic search as a service (ELSER on EIS). This is in technical preview on Stack 9.1.0 and Serverless.

Requests through the Elastic Managed LLM are currently proxied to AWS Bedrock in AWS US regions, beginning with us-east-1. The request routing does not restrict the location of your deployments.

ELSER requests are managed by Elastic's own EIS infrastructure and are also hosted in AWS US regions, beginning with us-east-1. All Elastic Cloud hosted deployments and serverless projects in any CSP and region can access the endpoint. As we expand the service to Azure, GCP, and more regions, we will automatically route requests to the same CSP and the region closest to where your Elasticsearch cluster is hosted.

ELSER on EIS is in technical preview on Serverless and in Stack 9.1.0.

ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect significantly better throughput and more consistent search latency compared to ML nodes, and we will continue to benchmark, remove limitations, and address concerns as we move towards General Availability.

You can now use semantic_text with the new ELSER endpoint on EIS. To learn how to use the .elser-2-elastic inference endpoint, refer to Using ELSER on EIS.
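As a minimal sketch using the Python Elasticsearch client, a mapping that routes inference for a semantic_text field to the .elser-2-elastic endpoint might look like this (the deployment URL, API key, index name, and field name are placeholders):

```python
from elasticsearch import Elasticsearch

# Placeholders: point the client at your own deployment and credentials.
es = Elasticsearch("https://my-deployment.es.example.com:443", api_key="YOUR_API_KEY")

# Route inference for the "content" field to ELSER on EIS instead of the
# default ELSER endpoint by setting inference_id to .elser-2-elastic.
es.indices.create(
    index="semantic-demo",
    mappings={
        "properties": {
            "content": {
                "type": "semantic_text",
                "inference_id": ".elser-2-elastic",
            }
        }
    },
)
```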

Semantic Search with semantic_text is a detailed tutorial on using the semantic_text field with the ELSER endpoint on EIS instead of the default endpoint. It is a great way to get started and try the new endpoint.
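Building on the sketch above, ingesting a document and querying the field could look like the following (again, names are placeholders; inference for both ingest and search runs through EIS):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://my-deployment.es.example.com:443", api_key="YOUR_API_KEY")

# Index a document; the semantic_text field is embedded via ELSER on EIS.
es.index(
    index="semantic-demo",
    id="1",
    document={"content": "The Elastic Inference Service runs ELSER on GPUs."},
    refresh=True,
)

# Run a semantic query against the same field.
resp = es.search(
    index="semantic-demo",
    query={"semantic": {"field": "content", "query": "Where does ELSER run?"}},
)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["content"])
```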

While we do encourage experimentation, we do not recommend implementing production use cases on top of this feature while it is in Technical Preview.

There are no uptime guarantees during the Technical Preview. While Elastic will address issues promptly, the feature may be unavailable for extended periods.

Inference throughput via this endpoint is expected to exceed that of inference operations on an ML node. However, throughput and latency are not guaranteed. Performance may vary during the Technical Preview.

Batches are limited to a maximum of 16 documents. This is particularly relevant when using the _bulk API for data ingestion.
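If you ingest with the Python bulk helper, one way to stay within that limit is to cap the chunk size at 16 documents per request; a rough sketch (index and field names are placeholders):

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch("https://my-deployment.es.example.com:443", api_key="YOUR_API_KEY")

# One action per document destined for the semantic_text-backed index.
actions = (
    {"_index": "semantic-demo", "_source": {"content": f"document {i}"}}
    for i in range(1000)
)

# chunk_size=16 keeps each underlying _bulk request at or below the
# 16-document batch limit of ELSER on EIS.
bulk(es, actions, chunk_size=16)
```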

The rate limit for search and ingest is currently 500 requests per minute. This allows you to ingest approximately 8,000 documents per minute at 16 documents per request.
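As a rough client-side illustration of these numbers (the limits below simply restate those above), a simple throttle can space requests so that no more than 500 are sent per minute:

```python
import time

REQUESTS_PER_MINUTE = 500          # current EIS rate limit for search and ingest
DOCS_PER_REQUEST = 16              # maximum batch size per request
MIN_INTERVAL = 60.0 / REQUESTS_PER_MINUTE   # ~0.12 seconds between requests

# 500 requests/minute x 16 docs/request = 8000 documents/minute.
print(REQUESTS_PER_MINUTE * DOCS_PER_REQUEST, "documents per minute")

def throttled(requests):
    """Yield requests no faster than the rate limit allows."""
    last = 0.0
    for request in requests:
        wait = MIN_INTERVAL - (time.monotonic() - last)
        if wait > 0:
            time.sleep(wait)
        last = time.monotonic()
        yield request
```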

All models on EIS incur a charge per million tokens. For pricing details for the Elastic Managed LLM and ELSER, refer to our Pricing page.