Pinecone vector database can now handle hybrid keyword-semantic search
When Pinecone announced a vector database at the beginning of last year, it was building something specifically designed for machine learning and aimed at data scientists. The idea was that data stored in a format machines understand could be queried much faster.
Originally this involved semantic search, where users could search based on meaning instead of specific words. As people put Pinecone to work, however, use cases emerged where specific keywords mattered. Today the company announced that it’s now possible to run searches that combine semantic and keyword matching, which company founder and CEO Edo Liberty calls hybrid search.
“We’ve conducted a lot of research on this topic and we found that, in fact, hybrid search ends up being better [in many cases]. It’s better in the sense that if you can combine both semantic search, this is the deep NLP encoding of sentences that gets the context and the meaning and so on, but you can also infuse that with specific keywords…the combination of those two ends up being significantly better,” Liberty told TechCrunch.
In fact, he says the two complement each other well, especially in cases where industry-specific terms matter. A doctor searching for information about a specific disease is one example: combining a natural-language question with keywords tied to that disease may return better results than either approach alone.
He says that the keywords never take precedence over the semantic question the user is asking, but they provide some extra information to help return more meaningful results.
“You might know exactly what you’re looking for, and you might be able to provide extra oomph when you make your semantic search keyword-aware – and that actually helps a lot. So I don’t want to throw away the good parts of keyword search [by relying completely on semantic search]. I don’t want the keywords to be in the driver’s seat, but I don’t want to ignore them completely either,” he said.
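To make the idea concrete, here is a minimal sketch of one common way to blend a semantic score with a keyword score: a weighted sum in which the semantic side carries most of the weight. This is an illustration only, not Pinecone's implementation. The function names, the toy two-dimensional vectors, the simple keyword-overlap measure, and the `alpha` weight of 0.8 are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_overlap(query_terms, doc_text):
    """Fraction of the query keywords that appear in the document text."""
    if not query_terms:
        return 0.0
    doc_terms = set(doc_text.lower().split())
    hits = sum(1 for term in query_terms if term.lower() in doc_terms)
    return hits / len(query_terms)


def hybrid_search(query_vec, query_terms, docs, alpha=0.8):
    """Rank docs by alpha * semantic + (1 - alpha) * keyword score.

    alpha is a hypothetical tuning knob: values above 0.5 keep the
    semantic score "in the driver's seat" while keywords add a boost.
    """
    scored = []
    for doc in docs:
        semantic = cosine(query_vec, doc["vec"])
        keyword = keyword_overlap(query_terms, doc["text"])
        scored.append((alpha * semantic + (1 - alpha) * keyword, doc["id"]))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Toy data: the vectors stand in for real sentence embeddings.
docs = [
    {"id": "A", "vec": [1.0, 0.0], "text": "treatment options for type 2 diabetes"},
    {"id": "B", "vec": [1.0, 0.0], "text": "treatment options for chronic conditions"},
    {"id": "C", "vec": [0.0, 1.0], "text": "diabetes bake sale fundraiser"},
]

ranking = hybrid_search([1.0, 0.0], ["diabetes"], docs)
print(ranking)
```

In this toy setup, documents A and B are equally close to the query in meaning, so the keyword match breaks the tie in favor of A. Document C contains the keyword but is semantically unrelated, so it ranks last: the keyword helps, but it does not override the semantic signal.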
As Liberty told us at the time of the company’s $28 million Series A last year, search has become a big use case for the company:
“The predominant use of the vector databases is for search, and search in the broad sense of the word. It’s searching through documents, but you can think about search as information retrieval in general, discovery, recommendation, anomaly detection and so on,” he said at the time.
Pinecone launched in 2019 and has raised $38 million, per Crunchbase.