We have been digging into OpenEvidence, a well-funded retrieval-augmented generation (RAG) tool that combines retrieval from a pool of medical literature with an AI language model, with the goal of generating clinically grounded, evidence-based answers.
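For readers curious about what "retrieval-augmented generation" actually means in practice, here is a minimal, generic sketch in Python. It is illustrative only: OpenEvidence has not published its implementation, and the corpus, relevance scoring, and language_model callable below are hypothetical stand-ins for whatever the company actually uses.

    # A minimal, generic sketch of the retrieval-augmented generation (RAG) pattern,
    # shown only to illustrate the concept. OpenEvidence has not published its
    # implementation; the corpus, scoring, and model below are hypothetical stand-ins.

    def overlap_score(question, passage):
        # Toy relevance score: count of shared words (real systems use embeddings).
        return len(set(question.lower().split()) & set(passage.lower().split()))

    def retrieve(question, corpus, top_k=5):
        # Return the top_k passages from the literature corpus that best match the question.
        ranked = sorted(corpus, key=lambda passage: overlap_score(question, passage), reverse=True)
        return ranked[:top_k]

    def answer(question, corpus, language_model):
        # Ground the language model's answer in the retrieved passages and ask for citations.
        evidence = retrieve(question, corpus)
        prompt = (
            "Answer the clinical question using only the evidence below, "
            "and cite the passages you rely on.\n\n"
            "Evidence:\n" + "\n".join(evidence) + "\n\n"
            "Question: " + question
        )
        return language_model(prompt)

The key point of the pattern is the middle step: instead of answering from whatever the model absorbed during training, the tool first retrieves passages from a defined pool of literature and instructs the model to answer only from those passages.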
When looking at a new product, it is helpful to compare it to another. So, here is a quick comparison of UpToDate and OpenEvidence, using some standard criteria for evaluating information. (The Netter Library has a guide on evaluating resources.)
Currency
UpToDate: Entries list both the date the article was last updated and the date the literature search was last reviewed, and UpToDate publishes information explaining its review process.
OpenEvidence: Unknown. Presumably the pool of information the tool draws on is updated on an ongoing basis, and we do see newer articles cited as references in its results. Anecdotally, some professors we have talked to who have been testing it report that it misses newer articles, drugs, and treatments.
Relevance
UpToDate: The market leader for clinical decision support tools, widely used throughout US healthcare and around the world.
OpenEvidence: Limited to medical topics and drawing only from a pool of medically relevant literature, so it should provide better results than a general large language model like ChatGPT or Copilot.
Authority
UpToDate: Each entry lists the subject experts who wrote the article, including their conflicts of interest. The editorial process outlines peer and expert review, and citations are provided. Founded in 1992, UpToDate is a widely studied tool, with many peer-reviewed articles published on its utility as well as on problems such as conflicts of interest.
OpenEvidence: Information on the team behind OpenEvidence is available. While the company has described its process and the circumstances under which it will not provide treatment recommendations, the information it provides is not peer or expert reviewed; rather, it comes from a finely tuned AI tool that pulls from peer-reviewed literature. Founded in 2022, OpenEvidence currently has only 18 articles in PubMed, so there is not much research available on it yet.
Accuracy
UpToDate: Articles are written by content experts, and there is a clear review and editorial process, along with a grading system for treatment recommendations. The editorial process outlines the types of databases and information sources that are searched to synthesize the evidence.
OpenEvidence: We do not know exactly what is in the training data or when new data is added. The company has announced partnerships with NEJM and JAMA. It was launched out of the Mayo Clinic's Platform_Accelerate program and was able to train on Mayo's 32 million de-identified patient records. In interviews the company says the tool is limited to peer-reviewed literature and that it searches across 35 million peer-reviewed publications (as of February 2025). We do not know whether that means the full text of all of those papers or only the data in a PubMed record (title, abstract, etc.), and we do not know whether the data pool contains retracted papers or papers from low- or no-quality paper mills.
Purpose
UpToDate: The stated purpose is to provide a clinical decision support tool, and published editorial guidelines outline processes intended to produce high-quality, bias-free information. The business model is based on very expensive institutional or individual subscriptions, so if it does not perform well, people will cancel.
OpenEvidence: The stated purpose is also to provide a clinical decision support tool, and the company says it is working to ensure reliability and accuracy. OpenEvidence is, however, ad-supported, which is a factor to consider. Just as conflicts of interest among human authors can skew recommendations toward particular drugs, advertising could potentially skew results, particularly once the financial backers of OpenEvidence want the company to focus on profits rather than product development and market share. They do have an advertising policy, which is a good sign.
Access
UpToDate: Available through the library to anyone at QU, with a mobile app also available. For those not affiliated with an institution that provides it, UpToDate costs between $219 and $579 a year.
OpenEvidence: Free. The number of queries is severely limited for those without a National Provider Identifier (NPI) or who are not verified health care students.
In short, we think OpenEvidence is a fascinating product that shows great promise, but some questions remain. It is a new tool, and its quality will likely only improve. As with any AI tool, though, one needs to confirm the information it provides before trusting it.