Pull requests · NVIDIA/TransformerEngine · GitHub
Pull requests: NVIDIA/TransformerEngine

Pull requests list

Manage dependencies and add missing einops req
#1859 opened Jun 7, 2025 by ksivaman
7 of 13 tasks
[PyTorch] Add support for FP8 current scaling in operation-based API
Labels: enhancement (New feature or request), testing (Improvements to tests or testing infrastructure)
#1858 opened Jun 6, 2025 by timmoon10
6 of 14 tasks
Use public API instead of removed private function in te_llama.py
#1856 opened Jun 6, 2025 by janekb04
2 of 13 tasks
[JAX] GEMM custom op
Labels: 2.5.0
#1855 opened Jun 6, 2025 by denera
6 of 13 tasks
pyproject.toml
#1852 opened Jun 5, 2025 by ksivaman Draft
4 of 13 tasks
Draft: Add support for overlapping wgrad NCCL AG with dgrad GEMM
#1849 opened Jun 4, 2025 by djns99
4 of 13 tasks
[PyTorch] Inference mode disables initializing quantized weights with column-wise usage
Labels: 2.5.0, bug (Something isn't working), enhancement (New feature or request)
#1847 opened Jun 4, 2025 by timmoon10
6 of 13 tasks
[JAX] TensorUsage + FP8 GEMM with all layouts handling on BW
Labels: 2.5.0
#1844 opened Jun 3, 2025 by phu0ngng
8 of 13 tasks
[PyTorch Debug] Fixed the empty tensor bug in statistics computation
#1843 opened Jun 3, 2025 by pggPL
8 of 13 tasks
TE Gemma tutorial attempt#2
#1839 opened Jun 2, 2025 by sudhakarsingh27 Draft
1 task done
Make quantize_ respect the usages of the quantizer
#1836 opened May 31, 2025 by ptrendx
13 tasks
[PyTorch] Use FP16 tols for distributed tests with TF32 compute
#1831 opened May 28, 2025 by timmoon10
6 of 13 tasks
Add cuBLASMp-backed GEMM-like API to TE common
#1824 opened May 27, 2025 by mk-61
4 of 13 tasks
FP8 Param support for offloading
#1823 opened May 27, 2025 by sanandaraj5597
Add support for head_dim > 128
Labels: 2.5.0
#1797 opened May 18, 2025 by cyanguwa
9 of 13 tasks
[PyTorch][MoE] Reduce CPU Overhead By Fuse Torch Empty Calls
Labels: performance (Performance issues)
#1793 opened May 16, 2025 by zhongbozhu
1 of 13 tasks
[common] Added support of FP4 data type
#1779 opened May 13, 2025 by Oleg-Goncharov
6 of 13 tasks
[PyTorch] Update PyTorch FSDP2 test to cover all TE layer types
Labels: testing (Improvements to tests or testing infrastructure)
#1777 opened May 12, 2025 by denera
8 of 13 tasks
[PyTorch] Draft of new activation offloading API
#1762 opened May 8, 2025 by pggPL Draft
13 tasks
cache sequence chunk ids for reordering
#1757 opened May 7, 2025 by xrennvidia Draft
13 tasks
Zr te doc edits
#1745 opened May 2, 2025 by zredeaux07
12 tasks
Fetched URL: http://github.com/NVIDIA/TransformerEngine/pulls
