[PR #128] feat: variable type tracking for boto3 clients and resources #254

Open
opened 2026-03-15 11:55:51 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/awslabs/iam-policy-autopilot/pull/128
Author: @adpaco-aws
Created: 1/29/2026
Status: 🔄 Open

Base: main ← Head: variable-type-tracking


📝 Commits (4)

  • d8cbee0 feat: variable type tracking for boto3 clients and resources
  • dac2753 fix: fix clippy and dylint errors
  • a8472b2 Merge branch 'main' into variable-type-tracking
  • 6888140 fix: fix fmt

📊 Changes

5 files changed (+1792 additions, -6 deletions)

View changed files

📝 iam-policy-autopilot-policy-generation/src/extraction/python/disambiguation.rs (+17 -2)
📝 iam-policy-autopilot-policy-generation/src/extraction/python/extractor.rs (+102 -4)
📝 iam-policy-autopilot-policy-generation/src/extraction/python/mod.rs (+1 -0)
📝 iam-policy-autopilot-policy-generation/src/extraction/python/node_kinds.rs (+3 -0)
➕ iam-policy-autopilot-policy-generation/src/extraction/python/variable_type_tracker.rs (+1669 -0)

📄 Description

Description of changes:

Summary

Adds variable type tracking to improve SDK method call extraction precision when boto3 clients and resources are passed across function boundaries. This PR also includes a fix to ensure variable tracking results are properly respected during disambiguation.

What's Tracked

  • Direct assignments: s3_client = boto3.client('s3'), dynamodb = boto3.resource('dynamodb')
  • Variable aliases: my_client = s3_client
  • Function parameters: Type inference from call sites (supports multiple types per parameter)
  • Resource-derived variables: table = dynamodb.Table('users'), bucket = s3.Bucket('name')
  • Python LEGB scoping: Function-local variables and parameters correctly shadow module-level variables of the same name
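
The first two patterns above (direct assignments and aliases) can be illustrated with a small Python sketch built on the standard `ast` module. This is not the PR's implementation, which lives in Rust in `variable_type_tracker.rs`; the function name `track_module_assignments` and the `(kind, service)` tuple format are hypothetical choices made only for this illustration, and the sketch covers module-level statements only.

```python
import ast


def track_module_assignments(source: str) -> dict[str, tuple[str, str]]:
    """Map module-level names to (kind, service), e.g. ('client', 's3').

    Hypothetical helper for illustration; not the PR's Rust tracker.
    """
    tracked: dict[str, tuple[str, str]] = {}
    for stmt in ast.parse(source).body:
        # Only simple single-target assignments such as `name = value`.
        if not (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1):
            continue
        target, value = stmt.targets[0], stmt.value
        if not isinstance(target, ast.Name):
            continue
        # Direct assignment: name = boto3.client('svc') / boto3.resource('svc')
        if (
            isinstance(value, ast.Call)
            and isinstance(value.func, ast.Attribute)
            and isinstance(value.func.value, ast.Name)
            and value.func.value.id == "boto3"
            and value.func.attr in ("client", "resource")
            and value.args
            and isinstance(value.args[0], ast.Constant)
            and isinstance(value.args[0].value, str)
        ):
            tracked[target.id] = (value.func.attr, value.args[0].value)
        # Alias: name = other_name, where other_name is already tracked.
        elif isinstance(value, ast.Name) and value.id in tracked:
            tracked[target.id] = tracked[value.id]
        else:
            # Rebound to something unknown: stop tracking this name.
            tracked.pop(target.id, None)
    return tracked


SAMPLE = """
import boto3
s3_client = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
my_client = s3_client
"""

types = track_module_assignments(SAMPLE)
print(types)
```

Here `my_client` inherits the `('client', 's3')` entry from `s3_client`. Resource-derived variables and function scopes are deliberately left out to keep the sketch short.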

Key Features

  • Module and function-level tracking: Variables tracked in both scopes
  • SDK object kind inference: Distinguishes between Client, Resource, and ResourceCollection
  • Multiple service types per parameter: Handles Python's dynamic typing where the same function can be called with different service types
  • Integration with disambiguation: Disambiguation now respects variable tracking results
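
To illustrate how one parameter can end up with several service types, the following Python sketch collects, for each call site of a known function, the services of the tracked variables passed as positional arguments. It is an assumption-laden illustration rather than the PR's Rust code: `infer_parameter_services` and the `module_types` mapping (variable name to service name) are hypothetical, and keyword arguments, methods, and nested functions are ignored.

```python
import ast
from collections import defaultdict


def infer_parameter_services(source: str, module_types: dict[str, str]):
    """Return {(function_name, parameter_name): {service, ...}}.

    Hypothetical helper for illustration; not the PR's Rust tracker.
    """
    tree = ast.parse(source)
    # Module-level function definitions, indexed by name.
    functions = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    inferred: dict[tuple[str, str], set[str]] = defaultdict(set)
    for node in ast.walk(tree):
        # Only plain calls such as `f(x)`; attribute calls are skipped.
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        func = functions.get(node.func.id)
        if func is None:
            continue
        params = [a.arg for a in func.args.args]
        # Pair positional arguments with parameters, left to right.
        for param, arg in zip(params, node.args):
            if isinstance(arg, ast.Name) and arg.id in module_types:
                inferred[(func.name, param)].add(module_types[arg.id])
    return dict(inferred)


SAMPLE = """
def list_all_certificates(client):
    return client.list_certificates()

acm_client = boto3.client('acm')
iot_client = boto3.client('iot')
list_all_certificates(acm_client)
list_all_certificates(iot_client)
"""

# Assumed to come from an earlier assignment-tracking pass.
module_types = {"acm_client": "acm", "iot_client": "iot"}
result = infer_parameter_services(SAMPLE, module_types)
print(result)
```

Because `list_all_certificates` is called once with an `acm` client and once with an `iot` client, the `client` parameter is associated with both services, so a call on it would be narrowed to those two rather than to a single one.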

Example 1: Direct Client Assignment

import boto3

# Variable tracking captures this as 'acm'
acm_client = boto3.client('acm')

# ListCertificates exists in acm, iot, AND transfer (all with no required params)
response = acm_client.list_certificates()

Before this PR:

$ extract-sdk-calls example_perfect_ambiguity.py
[{"Name":"list_certificates","PossibleServices":["acm","iot","transfer"]}]

→ Generates 3 policy statements

After this PR:

$ extract-sdk-calls example_perfect_ambiguity.py
[{"Name":"list_certificates","PossibleServices":["acm"]}]

→ Generates 1 policy statement

Example 2: Function Parameter Tracking

import boto3

def list_all_certificates(client):
    # Without tracking: ambiguous (could be acm, iot, or transfer)
    # With tracking: resolves to acm based on the passed client
    response = client.list_certificates()
    return response.get('CertificateSummaryList', [])

acm_client = boto3.client('acm')
certificates = list_all_certificates(acm_client)

Before: ["acm","iot","transfer"]
After: ["acm"]
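
The before/after results in both examples amount to narrowing the candidate services by what the tracker knows about the receiver variable. A minimal Python sketch of that narrowing step follows. The function name `disambiguate` is hypothetical, and falling back to the full candidate list when tracking yields nothing usable is an assumption of this sketch, not something the PR description states.

```python
def disambiguate(possible_services, tracked_services):
    """Narrow candidate services using tracked variable types.

    Hypothetical helper for illustration; not the PR's Rust code.
    """
    if not tracked_services:
        # Nothing known about the receiver: keep every candidate.
        return sorted(possible_services)
    narrowed = set(possible_services) & set(tracked_services)
    # Assumption: if tracking contradicts the candidates, keep them all.
    return sorted(narrowed) if narrowed else sorted(possible_services)


candidates = ["acm", "iot", "transfer"]
print(disambiguate(candidates, {"acm"}))  # ['acm']
print(disambiguate(candidates, set()))    # ['acm', 'iot', 'transfer']
```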

Testing

  • 23 unit tests covering all tracking patterns
  • Demo examples with before/after comparisons
  • All Python test sources verified to compile
  • Tests organized into 6 logical suites

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Reference
starred/iam-policy-autopilot#254