Business

Arthur releases open source tool to help businesses find the best LLM for a job

Published August 17, 2023

Ron Miller

Arthur, a machine learning monitoring startup, has benefited from the interest in generative AI this year, and it has been developing tools to help businesses work with LLMs more effectively. Today it is releasing Arthur Bench, an open source tool to help users find the best LLM for a particular set of data.

Adam Wenchel, CEO and co-founder at Arthur, says that the company has seen a lot of interest in generative AI and LLMs, and so it has been putting a lot of effort into creating products.

He says that today, granted that less than a year has passed since the release of ChatGPT, businesses don’t have an organized way to measure the effectiveness of one tool against another, and that’s why the company created Arthur Bench.

“Arthur Bench solves one of the critical problems that we just hear with every customer which is [with all of the model choices], which one is best for your particular application,” Wenchel told TechCrunch.

Arthur Bench comes with a suite of tools for methodically testing performance, but the real value is that it lets you measure how the kinds of prompts your users would write for your particular application perform across different LLMs.

Arthur Bench LLM comparison test suite hedging test.

Image Credits: Arthur

“You could potentially test 100 different prompts, and then see how two different LLMs – like how Anthropic compares to OpenAI – on the kinds of prompts that your users are likely to use,” Wenchel said. What’s more, he says that you can do that at scale and make a better decision on which model is best for your particular use case.
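To make the idea concrete, here is a minimal Python sketch of that kind of side-by-side comparison. It is not Arthur Bench's API: the names `compare_models`, `token_overlap`, `model_a` and `model_b` are hypothetical, the two "models" are canned stand-ins rather than real LLM calls, and the token-overlap score is only one simple assumption about how a response might be graded against a reference answer.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# A "model" here is any function that maps a prompt to a response.
# In practice this would wrap a call to an LLM provider (assumption).
Model = Callable[[str], str]


def token_overlap(candidate: str, reference: str) -> float:
    """Fraction of reference tokens that also appear in the candidate."""
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    cand_tokens = set(candidate.lower().split())
    return sum(tok in cand_tokens for tok in ref_tokens) / len(ref_tokens)


def compare_models(
    models: Dict[str, Model], cases: List[Tuple[str, str]]
) -> Dict[str, float]:
    """Run every (prompt, reference) case through each model; return mean scores."""
    return {
        name: mean(token_overlap(model(prompt), ref) for prompt, ref in cases)
        for name, model in models.items()
    }


# Hypothetical stand-ins that return canned answers instead of calling an LLM.
ANSWERS_A = {
    "What is the capital of France?": "Paris is the capital of France",
    "What is 2 + 2?": "The answer is 4",
}
ANSWERS_B = {
    "What is the capital of France?": "I am not sure",
    "What is 2 + 2?": "2 + 2 is 4",
}


def model_a(prompt: str) -> str:
    return ANSWERS_A[prompt]


def model_b(prompt: str) -> str:
    return ANSWERS_B[prompt]


# Prompts your users are likely to send, paired with reference answers.
cases = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
]

scores = compare_models({"model_a": model_a, "model_b": model_b}, cases)
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

Swapping the stand-ins for real provider calls and the two cases for a hundred representative prompts would give the per-model averages Wenchel describes, though a production test suite would likely need a more robust scoring method than token overlap.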

Arthur Bench is being released today as an open source tool. There will also be a SaaS version for customers who don’t want to deal with the complexity of managing the open source version, or who have larger test requirements and are willing to pay for that. But for now, Wenchel said, the company is concentrating on the open source project.

The new tool comes on the heels of the release of Arthur Shield in May, a kind of LLM firewall that is designed to detect hallucinations in models, while protecting against toxic information and private data leaks.
