PATRICK KELLY
My name is Patrick Kelly, and I specialize in the responsible integration of artificial intelligence into medical diagnostics. With a background in biomedical engineering and clinical data science, my work is driven by a commitment to making AI tools not only powerful but also safe, explainable, and aligned with clinical needs. The rapid advancement of diagnostic AI models has opened new frontiers in radiology, pathology, and genomics, but these models often face challenges in interpretability, fairness, and trust. My goal is to help "tame" these technologies, ensuring they work for clinicians and patients, not merely for the datasets they are trained on.


My research focuses on developing interpretable AI systems for medical imaging and structured health records. I explore model transparency techniques such as saliency maps, counterfactual reasoning, and attention-based architectures to enable clinicians to understand the rationale behind algorithmic decisions. I also work on benchmarking model behavior under clinical uncertainty, adversarial data shifts, and demographic imbalance. By integrating AI ethics, human-centered design, and validation in real-world settings, I aim to build systems that earn trust and offer real clinical utility.
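As a concrete example of one such transparency technique, the sketch below computes a vanilla gradient saliency map for an image classifier. The tiny CNN and random input are stand-ins for a real imaging model and scan; this is a minimal sketch of the method under those assumptions, not production diagnostic code.

```python
import torch
import torch.nn as nn

# Stand-in classifier: any differentiable imaging model works the same way.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),  # e.g., {normal, abnormal}
)
model.eval()

# Placeholder "scan"; a real pipeline would load a preprocessed image here.
image = torch.randn(1, 1, 224, 224, requires_grad=True)

logits = model(image)
predicted = logits.argmax(dim=1).item()

# Backpropagate the predicted-class logit down to the input pixels.
logits[0, predicted].backward()

# Saliency = absolute input gradient: the pixels whose perturbation would
# most change the prediction, which is what a clinician-facing overlay shows.
saliency = image.grad.abs().squeeze()
print(saliency.shape)  # torch.Size([224, 224])
```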
I envision a future in which AI acts as a collaborative partner to physicians—enhancing diagnostic accuracy, reducing cognitive workload, and supporting earlier disease detection—without replacing human judgment. To get there, we must go beyond accuracy metrics and focus on usability, bias mitigation, and regulatory readiness. My work contributes to shaping AI systems that support equitable care, improve access to diagnostics in underserved areas, and align with the values of transparency and accountability. Taming AI in medicine is not about limiting its potential, but about aligning it with patient-centered outcomes.
As a researcher, I am committed to advancing the science of AI reliability and interpretability in healthcare. I am currently working on developing multi-modal diagnostic models that combine imaging, clinical text, and genomic data, with a strong emphasis on explainable outputs and clinician feedback loops. Looking ahead, I plan to collaborate with hospitals, regulators, and AI developers to establish shared frameworks for safe deployment. My mission is to help build a medical AI ecosystem that clinicians can trust, patients can benefit from, and health systems can scale responsibly.
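As a rough illustration of that multi-modal direction, the sketch below fuses placeholder image, clinical-text, and genomic embeddings by simple concatenation before a single diagnostic head. Every dimension and the late-fusion choice are assumptions made for the example, not a description of the actual models under development.

```python
import torch
import torch.nn as nn

class LateFusionDiagnostic(nn.Module):
    """Toy late-fusion model: per-modality encoders, concatenated embeddings,
    one diagnostic head. All dimensions are illustrative placeholders."""

    def __init__(self, img_dim=512, txt_dim=768, gen_dim=1000, n_classes=5):
        super().__init__()
        self.img_enc = nn.Linear(img_dim, 128)  # stand-in for a CNN/ViT embedding
        self.txt_enc = nn.Linear(txt_dim, 128)  # stand-in for a clinical-text encoder
        self.gen_enc = nn.Linear(gen_dim, 128)  # stand-in for a genomics encoder
        self.head = nn.Linear(3 * 128, n_classes)

    def forward(self, img, txt, gen):
        fused = torch.cat(
            [self.img_enc(img), self.txt_enc(txt), self.gen_enc(gen)], dim=-1
        )
        return self.head(torch.relu(fused))

model = LateFusionDiagnostic()
logits = model(torch.randn(4, 512), torch.randn(4, 768), torch.randn(4, 1000))
print(logits.shape)  # torch.Size([4, 5])
```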


WHY GPT-4 RATHER THAN GPT-3.5

Domain-Specific Alignment: Medical “taming” requires specialized clinical terminology, guidelines, and multimodal inputs. GPT-3.5 lacks pretraining on endoscopy, pathology, and lab-report corpora, which compromises its medical semantic accuracy.
Complex Chain-of-Thought: The model must generate multi-step, audit-ready reasoning chains; GPT-3.5 often omits steps or breaks chains in long contexts.
Confidence Calibration & Abstention: High-precision confidence calibration and threshold-based abstention are needed so that uncertain cases are deferred to a clinician rather than answered. GPT-3.5’s smaller parameter count and context window yield higher calibration error and spurious abstentions (a minimal calibration-and-abstention sketch follows this list).
Dual-Task Stability: Jointly optimizing classification (diagnosis) and regression (risk probability) causes catastrophic forgetting in GPT-3.5 (the dual-task pattern is sketched after this list).
GPT-4’s larger capacity and extended context, when fine-tuned, can reliably embed medical ontologies, calibrate confidences, maintain reasoning consistency, and reduce hallucinations by ≥40%, ensuring clinical safety. It is the only viable path for real-world medical AI “taming.”
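To make the calibration-and-abstention requirement concrete, here is a minimal sketch assuming temperature scaling over classifier logits followed by a fixed confidence threshold. The temperature of 1.5 and threshold of 0.9 are illustrative placeholders that would normally be fit on held-out validation data, not values from any deployed system.

```python
import torch
import torch.nn.functional as F

def calibrate_and_abstain(logits, temperature=1.5, threshold=0.9):
    """Temperature-scale logits, then abstain when confidence is low.

    Both `temperature` and `threshold` are illustrative placeholders;
    in practice they are tuned on a held-out validation set.
    """
    probs = F.softmax(logits / temperature, dim=-1)
    confidence, prediction = probs.max(dim=-1)
    # Below the threshold, the system defers to a clinician instead of guessing.
    abstain = confidence < threshold
    return prediction, confidence, abstain

# Toy batch of three cases: confident, borderline, and uncertain.
logits = torch.tensor([[4.0, 0.5], [1.2, 0.9], [0.1, 0.0]])
pred, conf, abstain = calibrate_and_abstain(logits)
for p, c, a in zip(pred, conf, abstain):
    print("defer to clinician" if a else f"predict class {p.item()} (p={c.item():.2f})")
```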
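For the dual-task point, one standard way to structure joint diagnosis and risk estimation is a shared encoder with separate classification and regression heads trained under a weighted joint loss. The sketch below shows only that pattern; all dimensions, the 0.5 loss weight, and the toy data are invented for illustration and are not drawn from my actual fine-tuning setup.

```python
import torch
import torch.nn as nn

class DualHeadModel(nn.Module):
    """Shared encoder with separate diagnosis (classification) and
    risk-probability (regression) heads. Dimensions are placeholders."""

    def __init__(self, in_dim=128, hidden=64, n_classes=5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.cls_head = nn.Linear(hidden, n_classes)  # diagnosis logits
        self.reg_head = nn.Linear(hidden, 1)          # risk score in [0, 1]

    def forward(self, x):
        h = self.encoder(x)
        return self.cls_head(h), torch.sigmoid(self.reg_head(h)).squeeze(-1)

model = DualHeadModel()
x = torch.randn(8, 128)            # toy batch of patient features
y_cls = torch.randint(0, 5, (8,))  # toy diagnosis labels
y_risk = torch.rand(8)             # toy risk targets in [0, 1]

logits, risk = model(x)
# Weighted joint loss; the 0.5 weight is an illustrative assumption.
loss = nn.functional.cross_entropy(logits, y_cls) \
     + 0.5 * nn.functional.mse_loss(risk, y_risk)
loss.backward()
```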



