Subscribe to The Podcast by KevinMD. Watch on YouTube. Catch up on old episodes!
Physician executive Tim Wetherill discusses his article, “Why AI is the perfect neutral arbiter for health care claims.” Tim explores how deeply embedded incentives, inefficiencies, and subjectivity in the current claims adjudication process create opportunities for manipulation. He explains how AI, when trained on clear, bias-free rules, can standardize decision-making and eliminate ambiguity, reducing the need for multiple vendors and decreasing billing errors. Tim argues that AI is uniquely positioned to serve as an unbiased third party, improving outcomes for health plans, providers, and ultimately, patients. His insights point to a future where fairness and efficiency drive health care claims resolution.
Our presenting sponsor is Microsoft Dragon Copilot.
Want to streamline your clinical documentation and take advantage of customizations that put you in control? What about the ability to surface information right at the point of care or automate tasks with just a click? Now, you can.
Microsoft Dragon Copilot, your AI assistant for clinical workflow, is transforming how clinicians work. Offering an extensible AI workspace and a single, integrated platform, Dragon Copilot can help you unlock new levels of efficiency. Plus, it’s backed by a proven track record and decades of clinical expertise, it’s part of Microsoft Cloud for Healthcare, and it’s built on a foundation of trust.
Ease your administrative burdens and stay focused on what matters most with Dragon Copilot, your AI assistant for clinical workflow.
VISIT SPONSOR → https://aka.ms/kevinmd
SUBSCRIBE TO THE PODCAST → https://www.kevinmd.com/podcast
RECOMMENDED BY KEVINMD → https://www.kevinmd.com/recommended
Transcript
Kevin Pho: Hi, and welcome to the show. Subscribe at KevinMD.com/podcast. Today we welcome Dr. Tim Wetherill. He is a physician executive. Today’s KevinMD article is “Why AI is the perfect neutral arbiter for health care claims.” Tim, welcome to the show.
Tim Wetherill: Hey, thanks, Kevin. Excited to be here.
Kevin Pho: All right, so briefly tell us a little bit about your story and then about the article itself for those who did not get a chance to read it.
Tim Wetherill: Sure. I am a general surgeon by training. I grew up in the Philly area and then trained at the University of Kansas. I did everything from trauma to robotics to bariatrics in private practice and then moved to the VA, where I served as chief of the VA for Montana for a bit. Then—not by any plan—I gravitated to the payer world one day and worked for some Blues plans for about 10 years. After that, I transitioned to an AI startup, where I am the chief clinical officer today at Machinify.
Kevin Pho: All right, excellent. So for those who did not read your article, “Why AI is the perfect neutral arbiter for health care claims,” tell us what this article is about.
Tim Wetherill: It is really about using AI to read through medical records and claims to find the truth. We all know that medical records—especially for long inpatient stays—can be huge. The average length is around 800 pages now. We have even seen one that was 100,000 pages. That actually happened. You would think that was for someone who was in the hospital for six months, but I believe it was actually for a bill under five thousand dollars—something ridiculous like that. Trying to filter through all that is really difficult.
We also know that humans are humans, and we have error rates and differing ways to interpret things. For example, if you ask two different orthopods how to treat a fracture, you might get different opinions. It is part art, part science, and that comes through when reviewing claims and medical records. It is not completely a hard science. There is bias, and there may also be pressure from above for denials in some organizations. I never personally saw that, but I think some organizations might have it.
The AI is not about that. It just wants to spit out the facts and the truth. What you do with that truth is up to you. Our job is to give the reviewer—whether a nurse or a physician—enough information to decide what actually happened, whether the code is valid, and how the determination should be rendered.
Kevin Pho: Give us some insight in terms of the payer side. Before the advent of AI, how were claims typically evaluated on a routine basis?
Tim Wetherill: Claims are the bills that come in, and they are fairly superficial: ICDs, CPTs or HCPCS codes, patient identification—there is not much there. A lot of that is automated. If they do get flagged, it is usually for reasons that are not great, such as a certain dollar threshold or an unlisted code. Maybe something fishy shows up, like billing for 18 wheelchairs in a single day.
When that happens, it gets handed off to a reviewer, who might be a coding specialist or a clinician. They look at the medical record to decide what actually happened and whether the claim matches the record. Payers still, to this day, have an enormous manual process. It is not very sophisticated. They have a tremendous number of policies, guidelines, and rules to apply—some are payer-specific, and some come from CMS and the federal government. Those rules can be 30-page documents. When you ask a team of 50 clinicians to apply them, you can do the math and figure out what happens.
Kevin Pho: So describe how a particular AI system would work in practice.
Tim Wetherill: Sepsis is probably the most commonly “abused” diagnosis submitted for an inpatient stay, because it pays the most under a DRG (diagnosis-related group) system. A DRG is a grouping that attempts to pay the hospital based on how many resources are used rather than treating all conditions the same. Because sepsis pays a lot, it is heavily overused, and the criteria can be nebulous and difficult to tabulate.
We work within each payer’s policies and guidelines. We do not invent that science; we are using clinical literature, whether Sepsis-3 criteria or something else. It is based on sound science. We train AI to look through the record and find specific details. Sepsis is about whether there is an infection, a dysregulated immune response, any organ dysfunction, and so on. We categorize that and then give the user an output.
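To make the idea concrete, here is a minimal sketch of what a rule-based screen over AI-extracted chart findings might look like. The field names, thresholds, and output strings are illustrative assumptions, not Machinify’s actual logic; it simply mirrors the Sepsis-3 shape Tim describes (documented infection plus organ dysfunction, typically a SOFA score increase of 2 or more), with uncertain cases flagged for human review:

```python
from dataclasses import dataclass


@dataclass
class ChartFindings:
    """Facts an AI pipeline might extract from the medical record (hypothetical schema)."""
    infection_documented: bool  # is an infection documented anywhere in the chart?
    sofa_increase: int          # change in SOFA score from baseline, a proxy for organ dysfunction


def screen_sepsis3(findings: ChartFindings) -> str:
    """Apply a Sepsis-3-style rule and return a flag for the human reviewer."""
    if not findings.infection_documented:
        return "criteria not met: no documented infection"
    if findings.sofa_increase >= 2:
        return "criteria met: infection with qualifying organ dysfunction"
    # Infection without qualifying organ dysfunction: route to a clinician,
    # consistent with keeping a human in the loop for nuanced cases.
    return "needs review: infection without qualifying organ dysfunction"
```

The key design point is that the rule only surfaces facts and a flag; the determination itself stays with the nurse or physician reviewing the output.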
We are not necessarily making a determination—we are not in that business. We still believe this is not self-driving cars just yet, so the nurse or physician who is reviewing the AI output also has the record to validate and confirm that the AI was correct, accurate, and consistent. What we have seen is that clinicians are now reviewing records much more quickly, so providers get paid sooner. Accuracy improves, and consistency absolutely improves, which is good for everybody.
Kevin Pho: Do you find that there is ever any discrepancy between the human reviewer and the AI after the AI’s first pass?
Tim Wetherill: Yes, there is always room for error. Sometimes the optical character recognition (OCR) fails because the PDF is messy or “dirty.” Some numbers might get misread. Another example is when we encounter corner cases—the “zebras.” We train the system to call out its own uncertainty or say, “This needs further investigation.” We respect that medicine is not black‑and‑white, binary. You need expertise for interpretation and judgment calls.
What we focus on is getting the easy cases out of the way so the clinicians can use their expertise for the more nuanced ones. Ultimately, it is about less paperwork and more skill-based work.
Kevin Pho: Is there any feedback mechanism from the AI evaluation back to the clinician and the local hospital system, so they can improve documentation to better fit what they are submitting?
Tim Wetherill: There has always been feedback from payers to providers, hospital systems, and doctors. What is crazy is that I used to get those letters, and I would tear them up and throw them away. I am not a great example of adapting or learning. Payers know that most providers do not change behavior, even if the change is easy.
With our AI system, there is absolute feedback—both automated and manual. It learns from user input, and we also run feedback sessions with teams to see what works, what does not, and where improvements are needed. There is a lot of that, especially at this stage of the game.
Kevin Pho: How common is AI for payers to evaluate billing claims? It seems like an intuitive next step.
Tim Wetherill: Payers are not the most sophisticated industry from a technology standpoint. They still use faxes and DOS-based systems that are not even supported by their original manufacturers anymore. That is not an exaggeration—some core payer systems date to the 1990s. They are trying, but sometimes the motivation is simply not there. Yes, the federal government passes rules, people complain, but until there is a big push—like a new CMS requirement—it will not happen overnight.
I do think there is the will and intention, but this industry moves incredibly slowly. That is unfortunate because these AI solutions are not terribly difficult to implement. Many of them are cloud-based and do not require deep integration with legacy systems. It is primarily a culture problem.
Kevin Pho: So as we speak in April 2025, would you say that just a minority of payers are using AI tools to adjudicate billing claims?
Tim Wetherill: Yes, it is definitely a minority. There is probably some so‑called AI that touches a lot of claims, but it is rudimentary: matching an ICD to a CPT code. You can call that AI, but it is really just a rules engine. True, significant improvements in quality output due to AI are absolutely still in the minority.
Kevin Pho: I am a primary care physician. After we write our notes, we typically just put our own code in. I can see a scenario where AI looks at my note and suggests a code based on what I documented, so I do not have to worry about it. Do you see that happening?
Tim Wetherill: Yes, I do. That is a big opportunity. The only concern, though—and this applies on both the provider and payer sides—is ensuring the AI does not have a bias or hidden motive like, “Maximize Kevin’s revenue.” I do not think that should be the rule. It should simply be: “Here is what you documented—here is the CPT code for that visit, that procedure, or that problem.”
There was a nefarious case in 2019 in which the Department of Justice prosecuted an EHR company that had a kickback scheme with an opioid manufacturer. The software kept prompting doctors to prescribe more opioids, which is obviously terrible. We do not want that. We need to be aware of these possibilities. If you give me an AI tool, I want to know what its intent is, how it was designed, and whether there is federal oversight. There is nearly zero federal oversight on the business side of AI, and maybe some on the clinical side. That is what makes me nervous: It could accelerate the problems we already have.
Kevin Pho: So right now, there really are not a lot of safeguards both ways. It is almost a wild west if either side wants to manipulate AI to maximize revenue or drive denials.
Tim Wetherill: Yes. The only safeguard we have is the oath you, I, and others took—and our willingness to stand by it. Even though I am not in practice anymore, I still live by that oath. No matter how far removed we are, there is a powerful human interaction behind every claim. I refuse to make it worse. If anyone wanted me to do that, I would be gone.
Kevin Pho: On this podcast and on KevinMD, we often talk about how AI intersects various aspects of health care. Billing and coding is obviously one of those. Where do you see it going next?
Tim Wetherill: I see a huge opportunity for a neutral arbiter that both payers and providers trust. It would reduce fighting, speed up payments, and do so accurately. Everybody would know what to expect. If we could get past the greed on both sides, because no one is blameless here, and move beyond this arms race in billing and coding, it could be great.
We have been in a decades-long arms race. It is almost like mutually assured destruction. A neutral arbiter—some platform we could all agree on—would solve a lot of problems. Technologically, that is not so hard. Culturally, it is harder.
Kevin Pho: A tangential issue: with AI possibly taking over the coding side of the chart, do you worry about coding itself as a profession?
Tim Wetherill: Not necessarily. Coders are essential because AI depends on rules, and coding can be extremely complex. I am still learning nuances from my coding colleagues. There are so many corner cases. It is partly about language, partly about interpreting the record. We need experts to teach these AI models and improve them.
What I really hope is that we eventually simplify coding. We keep making it more complicated. When DRGs started in the 1980s, they were supposed to simplify and control costs by bundling services. But literature suggests they have not controlled costs. They just created loopholes that some could exploit.
Kevin Pho: We are talking to Tim Wetherill, a physician executive. Today’s KevinMD article is “Why AI is the perfect neutral arbiter for health care claims.” Tim, let us end with some take‑home messages you would like to leave with the KevinMD audience.
Tim Wetherill: Sure. One of the biggest lessons I learned: I had been in private practice, I was employed, I worked at the VA, and I did some prior‑auth reviews for Blue Cross plans, so I thought I knew a lot. But once I got into the payer space, I realized I did not know much at all. It is incredible how complex and confusing this world can be.
Payers do some things that are actually quite good, but many physician voices are loud and uninformed. They look at denial letters or have opinions on care. Because they do not understand the other side, it generates chaos. Sometimes physicians help design legislation (like prior‑auth legislation) that looks good on paper but actually increases denial rates. We need people curious enough to learn both sides and then design solutions based on that understanding.
Kevin Pho: Tim, thank you so much for sharing your perspective and insight, and thanks again for coming on the show.
Tim Wetherill: Thanks, Kevin. I really appreciate it.