[PyTorch debug] Fix test for debug tools #2507
base: main
/te-ci pytorch L1
Greptile Overview
Greptile Summary
Skips
Changes:
Confidence Score: 5/5
Important Files Changed
File Analysis
Sequence Diagram
sequenceDiagram
participant Test as Test Runner
participant Env as Environment Check
participant Config as Test Configuration
participant Skip as Skip Logic
participant Execute as Test Execution
Test->>Env: Check NVTE_TEST_NVINSPECT_ENABLED
Env->>Env: Convert env var to int (0 or 1)
Env->>Config: Store as global variable
alt Debug mode enabled
Env->>Config: Initialize nvdlfw_inspect.api
Config->>Config: Load debug config & features
end
Test->>Config: Iterate over test kwargs
Config->>Skip: Check delay_wgrad_compute flag
alt delay_wgrad_compute=True AND debug enabled
Skip->>Test: Skip test (continue to next)
Note over Skip,Test: Delayed wgrad is incompatible<br/>with debug mode
else
Config->>Execute: Run test with kwargs
Execute->>Test: Return test results
end
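The skip branch in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `select_runnable` and the `debug_enabled` parameter are hypothetical names, and only the `delay_wgrad_compute` key is taken from the diagram.

```python
def select_runnable(kwargs_list, debug_enabled):
    """Return the kwargs sets that would run (hypothetical helper).

    Assumption: each test configuration is a dict and the debug flag has
    already been read from the environment into a boolean.
    """
    runnable = []
    for kwargs in kwargs_list:
        # Delayed wgrad is incompatible with debug mode, so skip that
        # combination and continue with the next configuration.
        if kwargs.get("delay_wgrad_compute", False) and debug_enabled:
            continue
        runnable.append(kwargs)
    return runnable
```

With `debug_enabled=False` every configuration is kept. With `debug_enabled=True` only the ones that set `delay_wgrad_compute=True` are dropped.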
Additional Comments (3)
- tests/pytorch/distributed/run_numerics.py, line 745. logic: Inconsistent env var handling: os.environ.get() returns strings, not booleans. Other test files use int(os.environ.get("NVTE_TEST_NVINSPECT_ENABLED", "0")) (see tests/pytorch/test_numerics.py:106). While this works when set to "True", it would incorrectly evaluate to true if someone sets NVTE_TEST_NVINSPECT_ENABLED="0" or "False".
- tests/pytorch/distributed/run_numerics.py, line 929. logic: Same env var handling issue. Should use int(os.environ.get("NVTE_TEST_NVINSPECT_ENABLED", "0")) for consistency with test_numerics.py.
- tests/pytorch/distributed/run_numerics.py, line 1042. logic: Same env var handling issue. Should use int(os.environ.get("NVTE_TEST_NVINSPECT_ENABLED", "0")) for consistency with test_numerics.py.
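The difference between the two patterns discussed in these comments can be shown with a small sketch. The helper names are hypothetical, and the first function only approximates the flagged code, which is not quoted in this conversation. Both take a mapping so they can be tried without touching the real environment.

```python
ENV_VAR = "NVTE_TEST_NVINSPECT_ENABLED"


def flag_from_truthiness(env):
    # Approximation of the flagged pattern (assumption): the raw string
    # is used as a condition, so any non-empty value counts as enabled.
    return bool(env.get(ENV_VAR, ""))


def flag_from_int(env):
    # Pattern quoted in the review: parse the value as an integer,
    # defaulting to "0" when the variable is unset.
    return bool(int(env.get(ENV_VAR, "0")))
```

Passing `os.environ` as `env` reads the real setting. Note that the integer-based pattern accepts only numeric strings: a value such as "True" or "False" makes `int()` raise `ValueError` rather than being silently treated as enabled.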
1 file reviewed, 3 comments
…nabled Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
c4e695a to 7b88506
1 file reviewed, no comments
Description
Skip L1 tests which should not be run with debug tools.
Fixes # (issue)
Type of change
Checklist: