v0.8.0

Exciting New Features 🎉

  • Added impersonation of Spark Service Account through hooks. #1575
  • Write matrix outputs to BQ #1620
  • Cleanup script for raw data cleanup #1519
  • Implement versioning of transformers class in the integration pipeline #1551
  • Pipeline for each evaluation of matrix #1559
  • Change infra branches to main #1595
  • Improve model prediction performance with Spark Pandas UDFs #1540
  • Add drug and disease neighbours histograms #1613

Experiments 🧪

  • Run experiment comparing different versions of ROBOKOP in a standalone and integrated KG #151 and #156 in Lab Notebooks Repo
  • Implement an almost pure rank-based frequent flyers matrix transformation in the pipeline (run here)
  • Follow-up experiment from filtering_versions_robokop examining if filtering ROBOKOP in an integrated KG improves performance. Baseline RTX notebook here (https://github.com/everycure-org/lab-notebooks/blob/robokop-integrated-kg-experiment/cross-kg-modelling/robokop_integrated_versions.ipynb)
  • Follow-up experiment from filtering_versions_robokop examining if filtering ROBOKOP in an integrated KG improves performance. Unfiltered ROBOKOP version notebook here
  • Follow-up experiment from filtering_versions_robokop examining if filtering ROBOKOP in an integrated KG improves performance. Filtered ROBOKOP version notebook here
  • Disease split experiment using TxGNN disease groups with negatives synthesised as per our typical pipeline (e.g. randomly) + XGB notebook here
  • Disease split experiment using TxGNN disease groups with a new implementation of negative sampling to simulate a zero-shot scenario + XGB notebook here

Bugfixes 🐛

  • Change pathways for filtering pipeline #1567
  • Fix KG dashboard deploy action name in release action #1570
  • Remove KG Dashboard deployment action default release version #1603
  • Fixed broken CSS on evidence dashboard #1605
  • Fix KG dashboard deployment action environment variable #1614
  • Added missing permission for production GCP CloudBuild SA #1616
  • Fix KG Dashboard link in release PR #1624
  • Fix EC clinical trials transformers to use select_cols #1625
  • Update rtxkg2 transformer code #1566
  • Fix drug and disease ranks #1572

Technical Enhancements 🧰

  • Add uniform rank-based FF transform #1550
  • Added tolerations to main pod so that large instances could be tolerated #1552
  • Allow sampling pipeline release_version parameter to be null #1599
  • Checked and Modified nodes that don't need GPU #1563
  • Grant Orchard Production Project to access Dev Bucket #1593
  • Change Initial Desired state and Idle Timeout for Workbench #1635
  • Infra into main #1583
  • Resolve Critical Vulnerabilities in Packages and their sub-dependencies as of 13th June 2025 #1588
  • Production Infra Branch Merge into Main branch #1596
  • Infrastructure/deploy main changes #1607
  • Add prod data release zone bucket & infra #1612
  • Added missing permission for production GCP CloudBuild SA #1616
  • Add KG Dashboard docker configuration #1609
  • [KG Dashboard] Refactor project-id to environment variable #1608
  • Refactor KG Dashboard deploy action parameters to environment #1610

Documentation ✏️

  • Expand Modelling pipeline documentation #1622
  • Refactor Pipeline documentation to individual sections #1611
  • Create LICENSE #1543

Other Changes

  • Do not refresh credentials for GCP SA if it is a GitHub Action #1594
  • Removed orgPolicyAdmin from build #1617
  • Change permissions to read-only for create-release-pr.yml GitHub OIDC #1578
  • Production Infra Branch Merge into Main branch #1596
  • Upgraded gunicorn to version 23.0.0 #1601
  • Infrastructure/deploy main changes #1607
  • Bump the npm_and_yarn group across 2 directories with 5 updates #1597
  • Bump the pip group across 1 directory with 3 updates #1598