• 0 Posts
  • 12 Comments
Joined 1 year ago
Cake day: June 13th, 2023

  • There are probably a dozen LLMs trained exclusively on, or fine-tuned on, medical papers and other medical materials, specifically designed to do medical diagnosis. They already perform on par with or better than the average doctor in some tests. It’s already a thing. And they will get better. Will they replace doctors outright? Probably not, at least not for a while. But they certainly will be very helpful tools to help doctors make diagnoses and catch blind spots. I’d bet in 5-10 years it will be considered malpractice (i.e., below the standard of care) not to consult with a specialized LLM when making certain diagnoses.

    On the other hand, you make a very compelling argument of “nuh uh” so I guess I should take that into account.


  • It’s fine to be skeptical of AI medical diagnostics. But your response is as much of a knee-jerk “AI bad” as you accused me of being biased toward “AI good”. At no point did you ever bother to discuss or argue against any of the points I raised about the quality and usefulness of the cited study. Your response consisted entirely of 1) you sure as shit won’t trust AI, 2) doctors aren’t afraid of AI cause they are so busy, 3) I am biased, 4) capitalism bad (ironic since I was mostly talking about an open-source model), 5) the study I cited is bad because it’s a pre-print (unlike all the wonderful studies you cited).

    Since you don’t want to deal with the substance, and just want to talk about “AI bad, doctor good”, and since you only respect published studies: In the US our wonderful human doctors cause serious medical harm through misdiagnosis in about 800,000 cases a year (https://qualitysafety.bmj.com/content/early/2023/08/07/bmjqs-2021-014130). Our wonderful human doctors routinely ignore female complaints of pain, making women less likely to receive a diagnosis of abdominal pain (https://pubmed.ncbi.nlm.nih.gov/18439195/), less likely to receive treatment for knee pain (https://pubmed.ncbi.nlm.nih.gov/18332383/), more likely to be sent home by our human doctors after being misdiagnosed while suffering a heart attack (https://pubmed.ncbi.nlm.nih.gov/10770981/), and more likely to have a stroke diagnosis missed (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5361750/). So maybe let’s not pretend like humans are infallible.

    Healthcare diagnosis is something that could one day be greatly improved with the assistance of AI, which can be kept up to date with the latest studies, which can read and analyze a patient’s entire medical history and catch things a doctor might miss, and which can conduct statistical analysis far better than a doctor relying on their vague recollections from 30 years ago in medical school. An AI never has a bad day when it doesn’t feel like dealing with patients, is never tired or hungover, and will never dismiss a patient’s concerns because of some bias about the patient being a woman, or the wrong skin color, or because they sound dumb, or whatever else (yes, AI can be biased, they learn it from us, but I’d argue it’s easier to train bias out of AI than it is to train it out of the GP in Alabama screaming about DEI while writing a donation check to Trump). Will AI be perfect? No. Will it be better than doctors? Probably not for a while, but maybe. But it can absolutely assist and lead to better diagnoses.

    And since you want to cry about capitalism while defending one of the weirdest capitalistic structures (the healthcare industry), maybe think about what it would mean for millions of people to be able to run an open-source diagnostic tool on their phones to help determine if they need treatment, without having to be charged 300 dollars by a doctor for walking into the office just to be ignored and dismissed so the doctor can quickly move to the next patient that has health insurance so they can get paid. Hmm, maybe democratizing access to medical diagnostics and care might be anti-capitalist? Wild thought. No, that can’t be right, we need a system with health insurance gatekeepers and doctors taking on patients based on whether they have the insurance or cash to get them that new beamer.


  • This is such an annoyingly useless study. 1) the cases they gave ChatGPT were specifically designed to be unusual and challenging. They are basically brain teasers for pediatrics, so all you’ve shown is that ChatGPT can’t diagnose rare cases, but we learn nothing about how it does on common cases. It’s also not clear that these questions had actual verifiable answers, as the article only mentions that the magazine they were taken from sometimes explains the answers.

    2) since these are magazine brain teasers, and not an actual scored test, we have no idea how ChatGPT’s score compares to human pediatricians. Maybe an 83% error rate is better than the average pediatrician score.

    3) why even do this test with a general-purpose foundation model in the first place, when there are tons of domain-specific medical models already available, many open source?

    4) the paper is paywalled, but there doesn’t seem to be any indication that the researchers used any prompting strategies. Just last month Microsoft released a paper showing GPT-4, using chain-of-thought (CoT) and multi-shot prompting, could get a 90% score on the medical license exam, surpassing the 86.5% score of the domain-specific Med-PaLM 2 model.

    This paper just smacks of defensive doctors trying to dunk on ChatGPT. Give a multipurpose model super hard questions, no prompting advantage, and no way to compare its score against humans, and then just go “hurr durr chatbot is dumb.” I get it, doctors are terrified because specialized LLMs are very likely to take a big chunk of their work in the next five years, so anything they can do to muddy the water now and put some doubt in people’s minds is a little job protection.

    If they wanted to do something actually useful, they could give those same questions to a dozen human pediatricians, to GPT-4 with zero-shot prompting, to GPT-4 with Microsoft’s prompting strategy, and to Med-PaLM 2 or some other high-performing domain-specific models, and then compare the results. Oh, why not throw in a model that can reference an external medical database for fun! I’d be very interested in those results.

    Edit to add: If you want to read an actually interesting study, try this one: https://arxiv.org/pdf/2305.09617.pdf from May 2023. “Med-PaLM 2 scored up to 86.5% on the MedQA dataset…We performed detailed human evaluations on long-form questions along multiple axes relevant to clinical applications. In pairwise comparative ranking of 1066 consumer medical questions, physicians preferred Med-PaLM 2 answers to those produced by physicians on eight of nine axes pertaining to clinical utility.” The average human score is about 60% for comparison. This is the domain-specific LLM I mentioned above, which last month Microsoft got GPT-4 to beat just through better prompting strategies.

    Ugh, this article and study are annoying.



  • I’ve used it just to access Bing Chat, which has become my go-to AI chatbot for a couple of reasons: 1) you theoretically get access to GPT-4 without paying 20 dollars a month, 2) it cites its sources, and 3) it can create images via DALL-E from within the chat (which is handy, you can chat with the AI to help you think of an image prompt, then just say “ok make an image based on that description”). Other than that, I use Firefox at home. At work our choices are Chrome or Edge, so I use Edge because of Bing Chat and I kind of like the layout better. It feels like choosing between buying something from Amazon or Walmart: which terrible corporation do I hate more in a given moment?


  • I know literally nothing about computers and I’ve been daily driving Linux for well over a decade. I just use Ubuntu and I’ve been pretty much using all the default settings, apart from some customization here and there. There was a time years ago when I wanted to learn and tinker, but in reality I never learned to use the command line for more than running updates (I still sudo apt-get update cause it makes me feel like hackerman).

    My point is, Linux is super easy to just set up and run. If you want to learn more, there are plenty of opportunities for that. But it’s not something to be intimidated by at all. A lot of the community is enthusiasts (who I’ve found extremely helpful back when I used to have problems) so you’ll hear more jargon in these spaces. But I’m sure there are tons of others like me that use Linux just fine day to day without understanding a ton about computers.


  • You have it worse, I promise, that sounds miserable. I’m in northern California, but what the other reply you got said is accurate here as well. Lows are in the 60s F (15C), maybe even the upper 50s; when it’s really bad, lows are in the mid 70s. Most days I have a fan in the window to cool things off overnight, and even if not, it gets cool enough that the AC won’t work itself to death overnight. I get up early, so I open all the windows, put fans everywhere, and try to get my place down to 70F (21C), close it all up by 9am, then try to ride it out without AC until the lows drop again. Humidity is very low where I am too. This Sunday it’s now saying 105F (41C) for a high and 66F (19C) for a low, if that gives you an idea.



  • I was kind of surprised. Where I am those are pretty normal temperatures, not for weeks on end, but it can hit like that for a few days in a row. We’re expecting higher temperatures this weekend.

    Many Iranian cities and towns have suffered from temperatures above 40 degrees Celsius (104 Degrees Fahrenheit) in recent days, while the oil-rich southwestern city of Ahvaz hit 50 degrees Celsius on Tuesday. [122F]

    The capital city of Tehran experienced temperatures of 39 degrees Celsius on Wednesday.

    I just checked and their nightly lows are in the high 80s F, so that sucks for sure. That 122F high is bonkers though, that’s pushing Death Valley territory. But overall it’s not worse than what Arizona has been going through for more than a month, with highs above 110 and lows in the 90s. Greece’s heatwave seems like it is about on par with what Iran is going through, and I don’t remember hearing about them shutting down the country, just limiting outdoor work and deliveries during peak heat hours.

    But like you said, A/C might be a difference maker. I don’t know what Iran’s climate control availability is like, and this article didn’t say.