litellm - 💡(How to fix) Fix Race condition in /team/member_add — concurrent requests lose team members [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
BerriAI/litellm#25951Fetched 2026-04-18 05:52:55
View on GitHub
Comments
0
Participants
1
Timeline
0
Reactions
0

Root Cause

The team_member_add function in team_endpoints.py uses a read-modify-write pattern without any locking or transaction:

  1. READ: Fetches current members_with_roles from LiteLLM_TeamTable
  2. MODIFY: Appends new member to the list in memory
  3. WRITE: Overwrites the entire members_with_roles column with json.dumps()

When two requests execute concurrently:

  1. Request A reads members: [alice, bob]
  2. Request B reads members: [alice, bob] (same snapshot)
  3. Request A writes: [alice, bob, carol]
  4. Request B writes: [alice, bob, dave]carol is lost

The last write wins, silently dropping members added by concurrent requests.

Fix Action

Workaround

Use -parallelism=1 when calling via Terraform to serialize requests.

RAW_BUFFERClick to expand / collapse

Bug Description

POST /team/member_add has a race condition when multiple concurrent requests add members to the same team. Only 1-2 members get persisted per batch — the rest are silently lost.

Root Cause

The team_member_add function in team_endpoints.py uses a read-modify-write pattern without any locking or transaction:

  1. READ: Fetches current members_with_roles from LiteLLM_TeamTable
  2. MODIFY: Appends new member to the list in memory
  3. WRITE: Overwrites the entire members_with_roles column with json.dumps()

When two requests execute concurrently:

  1. Request A reads members: [alice, bob]
  2. Request B reads members: [alice, bob] (same snapshot)
  3. Request A writes: [alice, bob, carol]
  4. Request B writes: [alice, bob, dave]carol is lost

The last write wins, silently dropping members added by concurrent requests.

Steps to Reproduce

  1. Create a team
  2. Send 5+ POST /team/member_add requests concurrently (e.g., via Terraform with default parallelism)
  3. Check GET /team/info → only 1-2 of the 5 members are in members_with_roles
  4. The API returns 200 for all requests — no errors reported

Expected Behavior

All members should be persisted regardless of concurrency.

Workaround

Use -parallelism=1 when calling via Terraform to serialize requests.

Environment

  • LiteLLM version: v1.82.3
  • Database: PostgreSQL
  • Terraform provider: ncecere/litellm v1.2.3

Related

  • #8285 — Similar issue with /user/new not updating LiteLLM_TeamTable

extent analysis

TL;DR

Implementing a locking mechanism or transaction in the team_member_add function can resolve the race condition issue.

Guidance

  • Identify the critical section in the team_member_add function where the read-modify-write pattern is used and apply a locking mechanism to prevent concurrent access.
  • Consider using database transactions to ensure atomicity and consistency of the members_with_roles updates.
  • Review the team_endpoints.py file to ensure that the locking or transaction mechanism is properly implemented and does not introduce any new issues.
  • Test the updated team_member_add function with concurrent requests to verify that all members are persisted correctly.

Example

import threading

lock = threading.Lock()

def team_member_add(team_id, member):
    with lock:
        # READ: Fetch current members_with_roles from LiteLLM_TeamTable
        members = fetch_members(team_id)
        # MODIFY: Append new member to the list in memory
        members.append(member)
        # WRITE: Overwrite the entire members_with_roles column with json.dumps()
        update_members(team_id, members)

Notes

The provided workaround using -parallelism=1 with Terraform can mitigate the issue but may not be suitable for production environments where concurrency is expected. A proper locking or transaction mechanism is necessary to ensure data consistency.

Recommendation

Apply a locking mechanism or transaction to the team_member_add function to resolve the race condition issue, as it provides a more robust and scalable solution compared to the workaround.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING