Once you’ve understood the basics of Spring AI and how to interact with models in the simplest way, let’s explore more advanced concepts to build more powerful applications, taking full advantage of AI’s capabilities.

Some of the features we’ll explore next often go unnoticed but are crucial for improving user experience or integration with other applications. One such example is prompt context, allowing a chatbot to understand indirect references during a conversation:

simonrodriguez@simonrodriguez:~$ ollama run llama3.2:1b
>>> cual es la población de España hasta el año 2023? Solo indica el número de habitantes
Según los datos del Instituto Nacional de Estadística (INE) de España, en el año 2023 la población total de España es de aproximadamente 47.761.000 habitantes.
>>> y que idioma se habla?
El idioma oficial de España es el español, y está dividido en varias variantes regionales. Algunos de los idiomas más comunes son:
* Español (castellano)
* Catalán (catalan)
* Gallego (gallego)
* Balear (barcelonesa)
* Vasconés (vascuno)
* Aranés
* Asturian
* Basque
Además, también hay dialectos y variedades regionales del español que no son oficiales pero se hablan en diferentes partes de España.

Translated into English: What is the population of Spain up to the year 2023? Just state the number of inhabitants. According to data from the Spanish National Statistics Institute (INE), the total population of Spain in 2023 is approximately 47,761,000 inhabitants. And what language is spoken? The official language of Spain is Spanish, and it is divided into several regional variants. Some of the most common languages are: Spanish (Castilian), Catalan, Galician, Balearic (Mallorcan), Basque, Aranese, Asturian, Basque

In addition, there are dialects and regional varieties of Spanish that are not official but are spoken in different parts of Spain.

In this example, we can see how the second question doesn’t need to explicitly mention that it refers to Spain.

Another feature we’ll cover is the ability to use “tools” to enrich the model’s information. LLMs are pretrained models, meaning they contain knowledge only up to a certain date, and can’t deliver real-time results out of the box.

simonrodriguez@simonrodriguez:~$ ollama run llama3.2:1b
>>> tell me your cut off date
My knowledge cutoff is currently December 2023, but I have access to more recent information via internet search.
>>> Send a message (/? for help)

Popular applications like ChatGPT, DeepSeek, and others also retrieve real-time data through external functionalities beyond the LLM itself.

Advisors

Advisors allow you to intercept, modify, and enrich model interactions. Benefits include the use of common AI patterns, transformation of input/output data, and portability across models and use cases.

Core Components

As seen in the image below, the API is built around CallAroundAdvisor and CallAroundAdvisorChain classes (and their streaming equivalents), plus AdvisedRequest and AdvisedResponse for handling request/response. These also include an adviseContext to share state throughout the advisor chain.


nextAroundCall and nextAroundStream are the key methods for examining, customizing, and forwarding prompts, blocking requests, inspecting responses, and raising exceptions. There are also methods such as getOrder (to control execution order) and getName (to assign a unique advisor name).

The Spring AI advisor chain executes advisors sequentially based on their getOrder. Lower values run first; the last advisor sends the request to the LLM. The image below shows the interaction flow between advisors and the model.

  1. Spring AI creates an AdvisedRequest from the user’s prompt with an empty AdvisorContext.
  2. Each Advisor processes the request, modifying or blocking it (if blocked, it must return a response).
  3. The last framework-created Advisor sends the request to the Chat Model.
  4. The model’s response is returned to the Advisors as an AdvisedResponse.
  5. Each Advisor can process or modify the response.
  6. The AdvisedResponse is sent back to the client.
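The around-style flow described above can be sketched in plain Java. Everything here (the Advisor and Chain interfaces, the execute helper, string requests and responses) is a hypothetical stand-in for the Spring AI types, meant only to illustrate how each advisor wraps the call to the next one in the chain:

```java
import java.util.List;
import java.util.function.Function;

public class AdvisorChainSketch {

    // Hypothetical stand-ins: requests and responses are plain strings.
    interface Advisor {
        String aroundCall(String request, Chain chain);
    }

    interface Chain {
        String next(String request);
    }

    // Wraps each advisor around the next one; the innermost step "calls the model".
    static String execute(List<Advisor> advisors, String request, Function<String, String> model) {
        Chain chain = model::apply;
        for (int i = advisors.size() - 1; i >= 0; i--) {
            Advisor advisor = advisors.get(i);
            Chain next = chain;
            chain = req -> advisor.aroundCall(req, next);
        }
        return chain.next(request);
    }

    static String demo() {
        Advisor upperCase = (req, chain) -> chain.next(req.toUpperCase());     // modifies the request
        Advisor tagResponse = (req, chain) -> chain.next(req) + " [reviewed]"; // modifies the response
        return execute(List.of(upperCase, tagResponse), "hello", req -> "model saw: " + req);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // model saw: HELLO [reviewed]
    }
}
```

Note how the first advisor sees the request first but the response last, exactly the stack-like behaviour of the real chain.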

Advisor Execution Order

Advisor execution order in the chain is defined by the getOrder method. Lower values run first on the request path; because the chain unwinds like a stack, those same advisors run last on the response path. The last advisor in the chain sends the request to the LLM.

If you need an Advisor to run first in both the request and the response flows, one option is to register two separate Advisors, one with the highest precedence (first on the request path) and one with the lowest precedence (first on the response path), each acting only on its own direction.
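As a minimal illustration of the ordering rule (lower getOrder values run first on the request path), here is a plain-Java sketch; NamedAdvisor and executionOrder are hypothetical names, not part of the Spring AI API:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class AdvisorOrderSketch {

    // Hypothetical advisor with a name and an order, mirroring getName()/getOrder().
    record NamedAdvisor(String name, int order) {}

    // The chain sorts advisors by order: lower values run first on the request path.
    static List<String> executionOrder(List<NamedAdvisor> advisors) {
        return advisors.stream()
                .sorted(Comparator.comparingInt(NamedAdvisor::order))
                .map(NamedAdvisor::name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> order = executionOrder(List.of(
                new NamedAdvisor("LoggingAdvisor", 100),
                new NamedAdvisor("SafetyAdvisor", -100),
                new NamedAdvisor("MemoryAdvisor", 0)));
        System.out.println(order); // [SafetyAdvisor, MemoryAdvisor, LoggingAdvisor]
    }
}
```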

Implementing an Advisor

To create an Advisor, you must implement CallAroundAdvisor and its aroundCall method:

public class CustomAdvisor implements CallAroundAdvisor {

    @Override
    public String getName() {
        return this.getClass().getSimpleName();
    }

    @Override
    public int getOrder() {
        return 0;
    }

    @Override
    public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
        AdvisedResponse originalResponse = chain.nextAroundCall(this.beforeCall(advisedRequest));
        return this.afterCall(originalResponse);
    }

    // Customize the request before forwarding it down the chain
    private AdvisedRequest beforeCall(AdvisedRequest advisedRequest) {
        return advisedRequest; // e.g. transform the user text or add context
    }

    // Customize the response before returning it to the caller
    private AdvisedResponse afterCall(AdvisedResponse advisedResponse) {
        return advisedResponse; // e.g. inspect or enrich the response
    }
}

Advisors Provided by the Framework

Spring AI provides several built-in Advisors:

  1. Chat Memory:

memoryChatClient = chatClientBuilder
    .clone()
    .defaultAdvisors(new MessageChatMemoryAdvisor(new InMemoryChatMemory()))
    .build();

  2. Question Answering:
  3. Content Safety:

Best Practices

Structured Output

This API helps convert model output into a structured format to facilitate integration with other components/applications.

Before making the call to the model, the Converter includes instructions in the prompt for the model to generate the response in the desired format. The Converter then transforms the output into instances of the required format, such as JSON, XML, or domain entities.

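To illustrate the idea independently of Spring AI, here is a minimal sketch in the spirit of a ListOutputConverter: getFormat supplies format instructions to append to the prompt, and convert parses the raw model reply into a typed result (both names and the sample reply are hypothetical):

```java
import java.util.Arrays;
import java.util.List;

public class ListConverterSketch {

    // Instructions the converter adds to the prompt before the model call.
    static String getFormat() {
        return "Respond with a comma-separated list of values, nothing else.";
    }

    // Transformation applied to the raw model output after the call.
    static List<String> convert(String modelOutput) {
        return Arrays.stream(modelOutput.split(","))
                .map(String::trim)
                .toList();
    }

    public static void main(String[] args) {
        String prompt = "List three primary colors. " + getFormat();
        String reply = "red, green, blue"; // imagine the model replied this
        System.out.println(prompt);
        System.out.println(convert(reply)); // [red, green, blue]
    }
}
```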

Converters Provided by Spring AI

Spring AI currently provides the following Converter implementations:

AbstractConversionServiceOutputConverter, AbstractMessageOutputConverter, BeanOutputConverter, MapOutputConverter, ListOutputConverter
@GetMapping("/map")
Map<String, Object> getStandardOutputMap() {

    MapOutputConverter outputConverter = new MapOutputConverter();

    String format = outputConverter.getFormat();
    String template = """
            Give me a list of the 3 most important cryptocurrencies with their name, abbreviation and a brief description.
            Example: Bitcoin -> "abbreviation": "BTC", "description": "the most important cryptocurrency".
            {format}
            """;

    PromptTemplate promptTemplate = new PromptTemplate(template, Map.of("format", format));
    Prompt prompt = new Prompt(promptTemplate.createMessage());

    return chatClient.prompt(prompt)
            .call()
            .entity(outputConverter);
}

Tool Calling

Integration with "tools" lets the model trigger client-side functions to retrieve information or perform tasks when necessary. Its main use cases are:

It’s important to note that this pattern is not a built-in capability of the models, but rather a feature provided by the application itself. The model requests the tool with specific arguments, but the function is executed by the application.

How It Works

The key concepts and components for Tool Calling are:

  1. Define the "tool" you want to provide to the model, including its name, description, and input parameter schema.
  2. When the model decides to call a tool, it sends a response with the tool name and adapted input parameters.
  3. The application executes the tool.
  4. The result is processed by the application.
  5. The application sends the result back to the model.
  6. The model generates the final response using the tool output as additional context.
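The execution half of the steps above (the application running the requested tool and handing the result back to the model as context) can be sketched in plain Java; the tool registry, the canned price, and handleToolCall are hypothetical stand-ins, not Spring AI API:

```java
import java.util.Map;
import java.util.function.Function;

public class ToolCallingSketch {

    // Hypothetical registry: tool name -> function over a single string argument.
    static final Map<String, Function<String, String>> TOOLS =
            Map.of("btcPrice", currency -> "63600"); // canned value for the sketch

    // The "model" asked for a tool with an argument; the application executes it
    // and hands the result back as additional context for the final response.
    static String handleToolCall(String toolName, String argument) {
        Function<String, String> tool = TOOLS.get(toolName);
        if (tool == null) {
            throw new IllegalArgumentException("Unknown tool: " + toolName);
        }
        String result = tool.apply(argument);                   // application runs the tool
        return "model context: " + toolName + " -> " + result;  // sent back to the model
    }

    public static void main(String[] args) {
        System.out.println(handleToolCall("btcPrice", "EUR"));
        // model context: btcPrice -> 63600
    }
}
```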

Next, we’ll explore several ways to define and execute tools.

Methods as Tools

There are two ways to define tools from methods:

@Slf4j
public class BitcoinTool {

    @Tool(description = "Get the current price of the bitcoin cryptocurrency in euros")
    Long getBitcoinPriceInEuros() {

        try {
            String apiResponse = RestClient.create().get()
                    .uri(new URI("https://cex.io/api/last_price/BTC/EUR"))
                    .retrieve()
                    .body(String.class);

            BitcoinInfo bitcoinInfo = new ObjectMapper().readValue(apiResponse, BitcoinInfo.class);

            log.info("The current bitcoin price in euros is {}", bitcoinInfo.getLprice());
            return bitcoinInfo.getLprice();
        } catch (URISyntaxException | JsonProcessingException e) {
            // Fail fast instead of risking a NullPointerException on a null bitcoinInfo
            throw new IllegalStateException("Unable to retrieve the bitcoin price", e);
        }
    }
}
@GetMapping("/btc")
String toolCallingBtcPrice() {

    return this.chatClient.prompt().user("Give me the current bitcoin price")
            .tools(new BitcoinTool()).call().content();
}

You can also define default “tools” in the ChatClient.Builder using the defaultTools() method (this way, the tool applies to all requests). A similar behavior applies to the ChatModel class.

@Slf4j
public class PurchaseOrderTool {
    
    public void createBitcoinPurchaseOrder(@ToolParam(description = "Bitcoin amount") Integer bitcoinAmount, 
            @ToolParam(description = "Current bitcoin euros price") Long currentBitcoinEurosPrice) {
        log.info("Create a purchase order for {} btc with each btc at {} euros price", bitcoinAmount, currentBitcoinEurosPrice);
    }
}
@GetMapping("/btc-purchase-order")
String toolCallingBtcPriceSavePurchaseOrder() {

    Method method = ReflectionUtils.findMethod(PurchaseOrderTool.class, "createBitcoinPurchaseOrder", Integer.class, Long.class);
    ToolCallback toolCallback = MethodToolCallback.builder()
        .toolDefinition(ToolDefinition.builder(method)
            .description("Create bitcoin purchase order of bitcoin amount at current bitcoin euros price")
            .build())
        .toolMethod(method)
        .toolObject(new PurchaseOrderTool())
        .build();

    return this.chatClient.prompt().user("Give me the current bitcoin price in euros and create a bitcoin purchase order of 10 bitcoins at this current bitcoin euros price")
        .tools(new BitcoinTool())
        .tools(toolCallback)
        .call()
        .content();
}
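At its core, the MethodToolCallback approach resolves a method by name and invokes it reflectively with the arguments the model supplies. Here is a minimal plain-Java sketch of that idea (PriceTools and invokeTool are hypothetical names, not the Spring AI implementation):

```java
import java.lang.reflect.Method;

public class MethodToolSketch {

    // Hypothetical tool holder with one invocable method.
    public static class PriceTools {
        public long convert(long euros, long rate) {
            return euros * rate;
        }
    }

    // Resolve a method by name and argument count, then invoke it with
    // runtime-supplied arguments (as a tool-calling manager would).
    static Object invokeTool(Object target, String methodName, Object... args) throws Exception {
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(methodName) && m.getParameterCount() == args.length) {
                return m.invoke(target, args);
            }
        }
        throw new NoSuchMethodException(methodName);
    }

    public static void main(String[] args) throws Exception {
        Object result = invokeTool(new PriceTools(), "convert", 10L, 63600L);
        System.out.println(result); // 636000
    }
}
```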

On the other hand, some limitations apply when using methods as "tools": the following types are not supported as method parameters or return values:

Functions as tools

Another way to create "tools" is through functions. Just like with methods, there are two ways:

@Configuration(proxyBeanMethods = false)
class WeatherTools {

    public static final String CURRENT_WEATHER_TOOL = "currentWeather";

    WeatherService weatherService = new WeatherService();

    @Bean(CURRENT_WEATHER_TOOL)
    @Description("Get the weather in location")
    Function<WeatherRequest, WeatherResponse> currentWeather() {
        return weatherService;
    }
}
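Since a function tool is just a java.util.function.Function, the WeatherService referenced above can be as simple as the following sketch; the request/response records and the canned temperature are assumptions for illustration, not the real service:

```java
import java.util.function.Function;

public class FunctionToolSketch {

    // Hypothetical request/response records, mirroring WeatherRequest/WeatherResponse.
    record WeatherRequest(String location) {}
    record WeatherResponse(String location, double temperatureCelsius) {}

    // A service becomes usable as a function tool simply by implementing Function.
    static class WeatherService implements Function<WeatherRequest, WeatherResponse> {
        @Override
        public WeatherResponse apply(WeatherRequest request) {
            // Canned value; a real service would call a weather API here.
            return new WeatherResponse(request.location(), 21.5);
        }
    }

    public static void main(String[] args) {
        WeatherResponse response = new WeatherService().apply(new WeatherRequest("Madrid"));
        System.out.println(response.location() + ": " + response.temperatureCelsius() + " °C");
    }
}
```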

As with methods, there are some limitations when using functions as "tools": the following types are not supported as input or output parameters:

Note: It is important to provide a well-written description for each “tool,” as this helps the model understand when and how to use it correctly to generate a coherent response.

Tool Execution

The execution of “tools” is managed through a process handled by the ToolCallingManager interface. The default implementation is the DefaultToolCallingManager class, but it can be customized by creating a custom ToolCallingManager bean.

The default flow is shown in the following image:

  1. The definition of the “tool” is included in the chat request (ChatRequest), and the model receives this request.
  2. When the model decides to call a “tool,” it sends a response (ChatResponse) with the name of the tool and the corresponding parameters.
  3. ChatModel forwards the request to the ToolCallingManager.
  4. The ToolCallingManager identifies and executes the tool.
  5. The result of the execution is returned to the ToolCallingManager.
  6. The ToolCallingManager returns the result to the ChatModel.
  7. The ChatModel sends the result to the model.
  8. The model generates the final response using the tool's result as additional context.

Tool Discovery

The main way to use “tools” is through the mechanisms explained in previous sections (Methods and Functions as Tools). However, Spring AI also provides the option to discover tools dynamically at runtime using the ToolCallbackResolver interface along with tool names.

public interface ToolCallbackResolver {

    @Nullable
    ToolCallback resolve(String toolName);

}
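The resolver contract can be mimicked in a few lines of plain Java: a lookup by tool name that returns null when nothing matches, echoing the @Nullable return of resolve (all names in this sketch are hypothetical):

```java
import java.util.Map;
import java.util.function.Supplier;

public class ResolverSketch {

    // Minimal resolver in the spirit of ToolCallbackResolver:
    // look a tool up by name at runtime; null means "not found".
    interface Resolver {
        Supplier<String> resolve(String toolName);
    }

    // A static, map-backed resolver; other implementations could scan
    // an application context or a remote registry instead.
    static Resolver staticResolver(Map<String, Supplier<String>> tools) {
        return tools::get;
    }

    public static void main(String[] args) {
        Resolver resolver = staticResolver(Map.of("btcPrice", () -> "63600"));
        Supplier<String> tool = resolver.resolve("btcPrice");
        System.out.println(tool == null ? "not found" : tool.get()); // 63600
        System.out.println(resolver.resolve("missing") == null);     // true
    }
}
```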

By default, instances that implement ToolCallbackResolver include:

Specification

The documentation also details the tool specification for finer control over customization of tools.

Demo

We’ve created an app with the following endpoints to demonstrate the concepts covered so far:

Hi, my name is Jose. Hello! Welcome to our space. How can I help you today, José? Do you have any questions or need anything in particular? I'm here to listen and respond kindly and objectively.
What is my name? I cannot know your name. Is there anything else I can help you with?
Hello, my name is Jose. Hi, Jose! How can I assist you today? Is there anything specific you'd like to discuss or need information about? I'm here to help.


What is my name? Your name is Jose, as you mentioned earlier. How are you? Is that correct?


Give me information about Spain. Just one line. as of my knowledge cutoff in december 2023, spain was a european country and a constitutional monarchy governed by king felipe vi since 1975.


c.e.s.d.s.advisors.CustomAdvisor Input transformed GIVE ME INFORMATION ABOUT SPAIN. JUST ONE LINE. c.e.s.d.s.advisors.CustomAdvisor Request to the model AdvisedRequest chatModel=org.springframework.a1. c.e.s.d.s.advisors.CustomAdvisor Original response AdvisedResponse response=ChatResponse metadata={


{ "Cryptocurrencies": { "Bitcoin": { "symbol": "BTC", "description": "the most important cryptocurrency" { "Ethereum": { "symbol": "ETH", "description": "the second most important cryptocurrency" "Tether": { "symbol": "USDT", "description": "strong reserve currency" }


"ethereum": "{"symbol":"ETH", "description":"the second most important cryptocurrency"}", "litecoin": "{"symbol":"LTC","description":"the third most important cryptocurrency"}", "bitcoin": "{"symbol":"BTC", "description":"the most important cryptocurrency"}"


“Bitcoin”, “Ethereum”, “Bitcoin Cash”, “Litecoin”, “Dogecoin”,
I'm not able to provide real-time financial data or current prices for cryptocurrencies like Bitcoin. However, you can check popular exchange sites or financial news platforms for up-to-date information.
The current Bitcoin price is approximately 63600 EUR. Prices may change rapidly. This is for informational purposes only and not investment advice.
c.e.s.d.springai demo.tools.BitcoinTool : The current bitcoin price in euros is 63600 c.e.s.d.s.tools.PurchaseOrderTool : Create a purchase order for 10 btc with each btc at 1000 euros price
Creating bitcoin purchase order with function criptomarketservice criptomarketResponse/currency=BTC, amount=2, currencyPrice=80000, totalPrice=160000)

You can download the source code of this example app here.

Conclusion

In this Spring AI chapter, we explored various ways to enrich or transform model input and output to generate coherent responses — even using real-time information, although there are still some details to polish.

In upcoming posts, we’ll dive into RAG and the capabilities that support it.
