A quick guide to tool-calling in LLMs

Hands on Let's say you're tasked with solving a math problem like 4,242 x 1,977. Some of you might be able to do this in your head, but most of us would probably be reaching for a calculator right about now, not only because it's faster, but also to minimize the potential for error.

As it turns out, the same logic applies to large language models (LLMs). Ask a chatbot to solve that same math problem, and in most cases it'll generate a plausible but wrong answer. But give that model its own calculator and, with the right programming, it can suddenly solve problems like this accurately.
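
For the record, the correct answer is easy enough to check in a Python shell:

>>> 4242 * 1977
8386434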

This isn't limited to arithmetic, either. The right tools can give LLMs the ability to execute arbitrary code, access APIs, pass off complex requests to domain-specific models, or even search the web.

These tools are among the building blocks for achieving what folks have taken to calling "agentic AI." The idea is that, given the right tools, AI models can break down, plan, and solve complex problems with little to no supervision.

We'll leave it up to you to decide how much control you want to give these models, what guardrails they need, and how much you want to trust them. We'll focus on the mechanics today.

So in this hands-on, we'll explore some of the ways tool-calling can be used to augment the capabilities and address the limitations of LLMs.

What you'll need:

  • A system capable of running modest LLMs at 4-bit quantization. Any modern Nvidia graphics card or an AMD 7000-series GPU with at least 6GB of vRAM should do the trick. For Apple Silicon Macs, we recommend at least 16GB of memory.
  • A basic understanding of Python.
  • This guide also assumes that you already have the Ollama model runner installed. If you don't, you can find our guide here.
  • You'll also want to follow our instructions on deploying the Open WebUI dashboard here, as it provides a relatively friendly interface for deploying custom tools.

Choosing a model

Before we go any further, it's important to discuss model compatibility. Not all models support tool or function calling just yet.

As we understand it, Mistral-7B officially added tool support with Instruct version 0.3 in May 2024, while Meta introduced the functionality with the release of Llama 3.1 in July. Other notable models with tool support available from Ollama's repos include Mistral's NeMo, Large, and Mixtral models, as well as Cohere's Command R+.

In this guide, we'll be looking at Mistral-NeMo, but if you're running low on resources, we can confirm Llama 3.1-8B works and can fit into as little as 6GB of memory.

Building out our LLM toolbox

Going back to our earlier example, it might be a good idea to give our LLM of choice a calculator to help it cope with the occasional math problem.

To create a new tool, open the Workspace tab, navigate to Tools, and click the + button to create a new one.

To get started, head over to the "Workspace" tab in the Open WebUI sidebar, open the "Tools" section, and create a new tool.

The default script provides functions for a calculator, user information gathering, time and date, and a weather API.

By default, Open WebUI will populate the field with an example script that adds a variety of useful tools, including:

  • a function to retrieve your name and email from Open WebUI
  • a clock and calendar
  • a calculator
  • a weather app

You can leave this script unchanged, but to make it easier to see when it's working, we can add self.citation = True under def __init__(self): like so:

class Tools:
    def __init__(self):
        self.citation = True
        pass

Finally, give it a name like "BasicTools" and a brief description, then click save.

Remember to enable your tool to make it available to your model.

To put our tools to use, select your model and start a new chat, then press the "+" icon to the left of the message box and enable BasicTools. This will tell the model to use these tools wherever appropriate for the duration of your conversation.

Turns out, if you just give an LLM a calculator, it won't start hallucinating an answer.

Now, if we prompt the model with our math problem from earlier, we'll see that it not only responds with the correct answer, but also shows the tool it used to arrive at that figure.

Depending on the context of your conversation, multiple tools may be called to address your request.

Breaking it down

So, what's going on here? At least in Open WebUI, tools are defined as Python scripts. When formatted correctly, they can be called by the model to solve specific problems.

To make this easier to understand, we pulled this example from the Open WebUI demo script we looked at in the last step. It uses Python's datetime library to give the model a sense of time.

from datetime import datetime
class Tools:
    def __init__(self):
        self.citation = True # Add this if you want OpenWebUI to report when using a tool.
        pass
    def get_current_time(self) -> str:
        """
        Get the current time in a more human-readable format.
        :return: The current time.
        """
        now = datetime.now()
        current_time = now.strftime("%I:%M:%S %p")  # Using 12-hour format with AM/PM
        current_date = now.strftime(
            "%A, %B %d, %Y"
        )  # Full weekday, month name, day, and year
        return f"Current Date and Time = {current_date}, {current_time}"

Aside from the get_current_time function that actually does the work of retrieving the date and time, there are two elements to be aware of.

The first is the Tools class, which tells the model what functions are available for it to call.

class Tools:
    def __init__(self):
        self.citation = True # Add this if you want OpenWebUI to report when using a tool.
        pass

The second element is the docstring directly under the function itself. This doesn't just tell us what the function does; it also provides the LLM with instructions on when and how to use the tool.

"""
Get the current time in a more human-readable format.
:return: The current time.
"""

If you're struggling with your model being too conservative in its use of tools, we've found it can help to expand your docstrings with instructions on how, when, and in what format the model should use the tool.
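
As a rough illustration, a beefed-up docstring for the get_current_time function above might look something like this. The exact wording here is our own, so treat it as a starting point rather than gospel:

def get_current_time(self) -> str:
    """
    Get the current time and date in a human-readable format.
    Use this tool whenever the user asks for the time or the date,
    for example "what time is it?" or "what's today's date?"
    Do not guess or make up a time; always call this tool instead.
    :return: The current date and time as a formatted string.
    """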

Connect your LLM to anything

In addition to basic functions like clocks, calendars, and calculators, these tools can tie into just about anything with an exposed API.

Beyond retrieving data from remote sources, API calls can be used to automate all kinds of things, including managing hypervisors like Proxmox.

In this example, we cobbled together a tool that allows the LLM to connect to our Proxmox cluster's API using the Proxmoxer Python module and gather information on resource utilization.

"""
title: Proxmox-Status-Report
version: 1.0
description: API call to get status report from proxmox.
requirements: proxmoxer
"""
from typing import Optional
from proxmoxer import ProxmoxAPI
class Tools:
    def __init__(self):
        pass
    def proxmox_resource_report(self, node: Optional[str] = None) -> str:
        """
        Checks the resource utilization level of Proxmox nodes in a cluster based on a given node name.
        Examples of how this information might be requested:
        - "How are my nodes doing"
        - "Give me a status report of my homelab"
        - "Check the resource utilization on Jupiter"
        :param node: The name of a Proxmox node. Node names are lowercase. If the node name is not known, provide utilization data for all nodes in the cluster.
        :return: A string containing the resource utilization.
        """
        try:
            # Connect to the Proxmox API using the API token
            proxmox = ProxmoxAPI(
                "proxmox_host",
                user="username@realm",
                token_name="token_name",
                token_value="your_token_here",
                verify_ssl=False,  # Set to True if your cluster uses a valid SSL certificate
            )
            # Get the cluster resources
            cluster_resources = proxmox.cluster.resources.get()
            result = []
            for resource in cluster_resources:
                if resource["type"] == "node" and (
                    node is None or resource["node"] == node
                ):
                    node_name = resource["node"]
                    # Extract the relevant system information
                    cpu_cores = resource["maxcpu"]
                    memory_gb = round(resource["maxmem"] / 1024**3, 2)
                    disk_gb = round(resource["disk"] / 1024**3, 2)
                    disk_size = round(resource["maxdisk"] / 1024**3, 2)
                    # Get the node's resource utilization
                    node_status = proxmox.nodes(node_name).status.get()
                    cpu_usage = round(node_status["cpu"] * 100, 2)
                    memory_usage = round(
                        node_status["memory"]["used"]
                        / node_status["memory"]["total"]
                        * 100,
                        2,
                    )
                    disk_usage = round(resource["disk"] / resource["maxdisk"] * 100, 2)
                    # Collect the information in a formatted string
                    result.append(
                        f"Node: {node_name} | CPU Cores: {cpu_cores} | CPU Usage: {cpu_usage}% | "
                        f"Memory: {memory_gb} GB | Memory Usage: {memory_usage}% | "
                        f"Disk: {disk_gb} GB of {disk_size} GB | Disk Usage: {disk_usage}%"
                    )
            if not result:
                return f"No data found for node '{node}'." if node else "No data found."
            return "\n".join(result)
        except Exception as e:
            return f"An unexpected error occurred: {str(e)}"
# Example usage
if __name__ == "__main__":
    tools = Tools()
    result = tools.proxmox_resource_report("saturn")
    print(result)

Once saved in Open WebUI, we can simply enable the tool in a chat and ask for a health report on our Proxmox cluster. In this case, everything looks healthy.

Using API calls in your tools, you can automate a variety of tasks, like generating a health report for your Proxmox cluster.

Of note in this example is the docstring at the top of the script. Open WebUI uses it to, among other things, fetch the Python packages listed under requirements before running the tool.

Meanwhile, a little lower down, the docstring for the proxmox_resource_report function tells the model which parameters it should watch for when processing your prompt.

With the right safeguards in place, you could go even further and define functions for starting, cloning, and managing VMs and LXC containers. Hopefully this gives you an idea of just how extensible these functions can be.
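
For instance, a minimal sketch of a VM start function might look something like the following. The start_vm name and vmid parameter are our own inventions, the connection placeholders mirror those used above, and you'd want to add confirmation logic before letting a model anywhere near it:

    def start_vm(self, node: str, vmid: int) -> str:
        """
        Start a stopped QEMU virtual machine on a given Proxmox node.
        :param node: The name of the Proxmox node hosting the VM.
        :param vmid: The numeric ID of the virtual machine to start.
        :return: A string confirming the start request was issued.
        """
        try:
            proxmox = ProxmoxAPI(
                "proxmox_host",
                user="username@realm",
                token_name="token_name",
                token_value="your_token_here",
                verify_ssl=False,
            )
            # Proxmoxer maps attribute access onto the REST API, so this
            # issues a POST to /nodes/{node}/qemu/{vmid}/status/start
            proxmox.nodes(node).qemu(vmid).status.start.post()
            return f"Start request sent for VM {vmid} on node {node}."
        except Exception as e:
            return f"Failed to start VM {vmid}: {str(e)}"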

If you're looking for inspiration, you might want to check out Open WebUI's community page. Just remember not to go blindly pasting code into your dashboard.

Additional resources

Open WebUI's tool-calling implementation is one of the easiest to get started with, but it's just one of several out there.

Depending on your use case, one of these may be better suited to your needs. We also expect the number of models trained to take advantage of these tools to grow over the next few months.
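
For a taste of what tool-calling looks like outside of Open WebUI, here's a minimal sketch using Ollama's Python client. It assumes you've pulled mistral-nemo and installed the ollama package; the multiply function and its schema are stand-ins of our own making:

import ollama

# A trivial tool for the model to call
def multiply(a: int, b: int) -> int:
    return a * b

response = ollama.chat(
    model="mistral-nemo",
    messages=[{"role": "user", "content": "What is 4,242 x 1,977?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "multiply",
                "description": "Multiply two integers and return the product.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "a": {"type": "integer", "description": "First factor"},
                        "b": {"type": "integer", "description": "Second factor"},
                    },
                    "required": ["a", "b"],
                },
            },
        }
    ],
)

# The model doesn't execute anything itself; it returns a tool_calls entry,
# and it's up to our code to run the function and pass the result back.
for call in response["message"].get("tool_calls") or []:
    args = call["function"]["arguments"]
    print(multiply(int(args["a"]), int(args["b"])))  # 8386434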

The Register aims to bring you more local AI content like this in the near future, so be sure to share your burning questions in the comments section and let us know what you'd like to see next. ®

Editor's note: The Register was provided an RTX 6000 Ada Generation graphics card by Nvidia and an Arc A770 GPU by Intel to support stories like this. Neither company had any input into the contents of this article.
