dify - [Refactor/Chore] Optimize free plan workflow run cleanup batching [1 pull request]

The free plan expired workflow run cleanup currently scans expired workflow runs by time range, loads full WorkflowRun ORM objects, checks the billing plan for tenants found in each batch, and then deletes only runs owned by eligible free-plan tenants.

I would like to refactor this flow so the cleanup job has an explicit lightweight pipeline:

  1. Fetch candidate workflow run cleanup references only, such as id, tenant_id, and created_at, instead of full WorkflowRun rows.
  2. Extract candidate tenant IDs and use billing information to determine eligible free-plan tenants, preserving the existing sandbox plan, grace period, and cleanup whitelist behavior.
  3. Push tenant_ids=eligible_free_tenants down into the repository query so the target batch only includes free-plan tenants.
  4. Add repository methods for cleanup-specific lightweight refs, for example WorkflowRunCleanupRef(id, tenant_id, created_at), and count/delete related records by workflow run IDs.
  5. Keep deletion and dry-run counting based on run IDs rather than full WorkflowRun models.
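The five steps above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: only `WorkflowRunCleanupRef(id, tenant_id, created_at)` and the `tenant_ids` filter come from the issue. The in-memory repository, its method names (`get_cleanup_refs`, `count_by_run_ids`, `delete_by_run_ids`), the `plans` mapping, the `"professional"` plan name, and the sample data are all hypothetical. Grace-period handling and cursor advancement between batches are left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class WorkflowRunCleanupRef:
    """Lightweight reference with only the three fields listed in the issue."""
    id: str
    tenant_id: str
    created_at: datetime


class InMemoryRunRepository:
    """Hypothetical stand-in for the API workflow run repository."""

    def __init__(self, refs):
        self._refs = {ref.id: ref for ref in refs}

    def get_cleanup_refs(self, before, batch_size, tenant_ids=None):
        # tenant_ids=None scans every tenant (candidate discovery);
        # passing a set pushes the eligibility filter into the "query".
        rows = [
            ref for ref in self._refs.values()
            if ref.created_at < before
            and (tenant_ids is None or ref.tenant_id in tenant_ids)
        ]
        rows.sort(key=lambda ref: (ref.created_at, ref.id))
        return rows[:batch_size]

    def count_by_run_ids(self, run_ids):
        return sum(1 for run_id in run_ids if run_id in self._refs)

    def delete_by_run_ids(self, run_ids):
        deleted = 0
        for run_id in run_ids:
            if self._refs.pop(run_id, None) is not None:
                deleted += 1
        return deleted


def eligible_free_tenants(tenant_ids, plans, whitelist):
    # Assumption: plans maps tenant_id -> plan name. The real job also
    # applies a billing grace period, which this sketch omits.
    return {t for t in tenant_ids if plans.get(t) == "sandbox" and t not in whitelist}


def cleanup_batch(repo, plans, whitelist, before, batch_size, dry_run):
    # Step 1: fetch lightweight candidate refs, not full rows.
    candidates = repo.get_cleanup_refs(before, batch_size)
    # Step 2: resolve billing eligibility for the tenants seen in the batch.
    tenants = eligible_free_tenants({ref.tenant_id for ref in candidates}, plans, whitelist)
    if not tenants:
        return 0  # a real job would advance its cursor here instead of stopping
    # Step 3: re-query with the eligible tenants pushed down.
    targets = repo.get_cleanup_refs(before, batch_size, tenant_ids=tenants)
    run_ids = [ref.id for ref in targets]
    # Step 5: count or delete by run ID only.
    if dry_run:
        return repo.count_by_run_ids(run_ids)
    return repo.delete_by_run_ids(run_ids)


now = datetime(2024, 1, 31)
repo = InMemoryRunRepository([
    WorkflowRunCleanupRef("r1", "free-a", now - timedelta(days=40)),
    WorkflowRunCleanupRef("r2", "paid-b", now - timedelta(days=40)),
    WorkflowRunCleanupRef("r3", "free-a", now - timedelta(days=1)),
])
plans = {"free-a": "sandbox", "paid-b": "professional"}
cutoff = now - timedelta(days=30)

dry = cleanup_batch(repo, plans, set(), cutoff, 100, dry_run=True)
deleted = cleanup_batch(repo, plans, set(), cutoff, 100, dry_run=False)
```

In this sample, the dry run and the real run both report one run (`r1`), because `r2` belongs to a paid tenant and `r3` is newer than the cutoff. Keeping the dry-run path on the same run-ID list as the delete path is one way to keep the two outputs consistent.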

Fix Action

Fixed

Self Checks

  • I have read the Contributing Guide and Language Policy.
  • This is only for refactors or chores; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report, otherwise it will be closed.
  • [Chinese and non-English users] Please submit in English, otherwise the issue will be closed :)
  • Please do not modify this template :) and fill in all the required fields.

Motivation

This should reduce memory and database load for large cleanup windows, especially when many scanned workflow runs belong to paid or otherwise ineligible tenants. It also makes the cleanup logic easier to reason about by separating candidate discovery, billing eligibility, and repository-level target selection.
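The database-side saving can be pictured with a small standard-library `sqlite3` sketch. It is illustrative only: the `workflow_runs` table layout, the `outputs` payload column, and the `fetch_cleanup_refs` helper are assumptions, not the project's schema or repository API. The point is that the query selects just `id`, `tenant_id`, and `created_at`, and that the eligible tenant set becomes part of the `WHERE` clause, so rows for paid tenants and large payload columns never leave the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: three reference columns plus one large payload column.
conn.execute(
    "CREATE TABLE workflow_runs ("
    "id TEXT PRIMARY KEY, tenant_id TEXT, created_at TEXT, outputs TEXT)"
)
conn.executemany(
    "INSERT INTO workflow_runs VALUES (?, ?, ?, ?)",
    [
        ("r1", "free-a", "2023-12-01T00:00:00", "x" * 1000),
        ("r2", "paid-b", "2023-12-02T00:00:00", "x" * 1000),
        ("r3", "free-c", "2023-12-03T00:00:00", "x" * 1000),
        ("r4", "free-a", "2024-01-30T00:00:00", "x" * 1000),
    ],
)


def fetch_cleanup_refs(conn, before, batch_size, tenant_ids=None):
    """Return (id, tenant_id, created_at) tuples older than `before`."""
    sql = "SELECT id, tenant_id, created_at FROM workflow_runs WHERE created_at < ?"
    params = [before]
    if tenant_ids is not None:
        if not tenant_ids:
            return []  # no eligible tenants, so nothing to fetch
        placeholders = ", ".join("?" for _ in tenant_ids)
        sql += f" AND tenant_id IN ({placeholders})"
        params.extend(sorted(tenant_ids))
    sql += " ORDER BY created_at, id LIMIT ?"
    params.append(batch_size)
    return conn.execute(sql, params).fetchall()


cutoff = "2024-01-01T00:00:00"  # ISO-8601 strings compare correctly as text
all_refs = fetch_cleanup_refs(conn, cutoff, 100)
free_refs = fetch_cleanup_refs(conn, cutoff, 100, {"free-a", "free-c"})
```

Here `all_refs` holds three expired runs across all tenants, while `free_refs` holds only the two owned by the tenants passed in, and each tuple carries three small fields instead of the full row.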

Additional Context

The intended scope is the backend retention cleanup around clear_free_plan_expired_workflow_run_logs.py and the API workflow run repository implementation. The behavior of the existing CLI options, retention windows, dry-run output, metrics, billing grace period, and whitelist checks should stay the same.
