A key addition to the new staging system is longitudinal strain, an echocardiographic parameter often criticized for inter-operator and inter-vendor variability. The study's strength lies in externally validating its -9% threshold across multiple centers in the US, UK, and Europe, demonstrating that it is a robust and reproducible predictor of poor outcomes in real-world settings.
A blinded central radiology review is not the absolute gold standard for assessing patient progression. Expert clinicians argue that their holistic assessment, which incorporates the patient's clinical status and other biomarkers alongside the scans, provides critical context that a reviewer working from images alone lacks.
The CREST trial showed a benefit driven by patients with carcinoma in situ (CIS), while the POTOMAC trial showed a lack of benefit in the same subgroup. This stark inconsistency demonstrates that subgroup analyses, even for stratification factors, can be unreliable and are a weak basis for regulatory decisions or label restrictions.
The primary bottleneck in improving AI is no longer data or compute, but the creation of "evals": tests that measure a model's capabilities. These evals act as product requirements documents (PRDs) for researchers, defining what success looks like and guiding the training process.
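A minimal sketch of what such an eval can look like in practice, with a hypothetical `generate` callable standing in for the model under test; the cases and pass threshold below are illustrative placeholders, not a real benchmark. The point is that the eval encodes the requirement itself, so researchers train against it the way engineers build against a PRD.

```python
# Sketch of an "eval" as an executable spec: fixed cases, a scoring rule,
# and a pass threshold that doubles as the success criterion.
from typing import Callable

EVAL_CASES = [
    # (prompt, expected substring) -- toy examples, not a real benchmark
    ("What is 7 * 8?", "56"),
    ("Name the capital of France.", "Paris"),
]

def run_eval(generate: Callable[[str], str], pass_threshold: float = 0.9) -> bool:
    """Score the model on each case and compare against the target, PRD-style."""
    correct = sum(expected in generate(prompt) for prompt, expected in EVAL_CASES)
    score = correct / len(EVAL_CASES)
    print(f"eval score: {score:.2f} (target {pass_threshold:.2f})")
    return score >= pass_threshold
```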
The group formerly classified as high-risk (now Stage 3b) was traditionally excluded from major clinical trials. The new staging system demonstrates that these patients have better-than-expected outcomes with modern therapy and should be included in future studies, while simultaneously identifying a new ultra-high-risk group (Stage 3c) that requires entirely different trial designs.
When selecting foundation models, engineering teams often prioritize "taste" and predictable failure patterns over raw performance. A model that fails slightly more often, but in a consistent and understandable way, is more valuable and easier to build robust systems around than a top performer with erratic, hard-to-debug errors.
The RAMPART trial's use of the Leibovich score for risk stratification is a key strength. Unlike traditional TNM staging, this score weights tumor grade more heavily, which clinicians find to be a more granular and clinically relevant predictor of recurrence risk than tumor size alone.
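To make the weighting point concrete, here is a toy composite score in the spirit of the Leibovich approach. The components (stage, size, nodal status, grade, necrosis) mirror those commonly attributed to the score, but the point values are illustrative placeholders chosen only to show grade contributing more than size; they are not the published Leibovich weights.

```python
# Toy grade-weighted risk score. Point values are placeholders, NOT the
# published Leibovich weights; they only illustrate grade outweighing size.
def toy_risk_score(grade: int, size_cm: float, pt_stage: str,
                   node_positive: bool, necrosis: bool) -> int:
    score = 0
    score += {1: 0, 2: 0, 3: 1, 4: 3}.get(grade, 0)          # grade carries the most weight here
    score += 1 if size_cm >= 10 else 0                        # size contributes a single point
    score += {"pT1": 0, "pT2": 1, "pT3": 2, "pT4": 3}.get(pt_stage, 0)
    score += 2 if node_positive else 0
    score += 1 if necrosis else 0
    return score
```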
Modernizing trials is less about new tools and more about adopting a risk-proportional mindset, as outlined in ICH E6(R3) guidelines. This involves focusing rigorous oversight on critical data and processes while applying lighter, more automated checks elsewhere, breaking the industry's habit of treating all data with the same level of manual scrutiny.
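As a rough illustration of the risk-proportional idea (not an implementation of ICH E6(R3), which describes processes rather than code), the sketch below routes hypothetical critical fields to manual review and leaves everything else to lightweight automated checks; the field names and rules are invented for the example.

```python
# Sketch: route each data field to heavy (manual-review) or light (automated)
# checks based on whether it is critical to the primary endpoint or safety.
# Field names and rules are illustrative, not from any real study.
CRITICAL_FIELDS = {"primary_endpoint_date", "serious_adverse_event", "informed_consent_date"}

def check_record(record: dict) -> dict:
    issues = {"flag_for_manual_review": [], "auto_checked": []}
    for field, value in record.items():
        if field in CRITICAL_FIELDS:
            # Critical data: anything missing goes to a human reviewer.
            if value in (None, ""):
                issues["flag_for_manual_review"].append(field)
        else:
            # Non-critical data: automated checks only, no manual scrutiny.
            issues["auto_checked"].append(field)
    return issues
```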
Experts believe the stark difference in complete response rates (5% vs 30%) between two major ADC trials is likely due to "noise"—variations in patient populations (e.g., more upper tract disease) and stricter central review criteria, rather than a fundamental difference in the therapies' effectiveness.
An estimated 12 million serious diagnostic errors occur each year in the U.S., resulting in roughly 800,000 deaths or permanent disabilities. Cardiologist Eric Topol frames this as a massive, under-acknowledged systemic crisis that the medical community fails to adequately address, rather than as a series of isolated incidents.
The successful KEYNOTE-564 trial intentionally used a pragmatic patient selection model based on universally available pathology data like TNM stage and grade. This approach avoids complex, inconsistently applied nomograms, ensuring broader real-world applicability and potentially smoother trial execution compared to studies relying on more niche scoring systems.