Jeremy Reizenstein
9eeb456e82
Update license for company name
Summary: Update all FB license strings to the new format.
Reviewed By: patricklabatut
Differential Revision: D33403538
fbshipit-source-id: 97a4596c5c888f3c54f44456dc07e718a387a02c
2022-01-04 11:43:38 -08:00
Patrick Labatut
af93f34834
License lint codebase
Summary: License lint codebase
Reviewed By: theschnitz
Differential Revision: D29001799
fbshipit-source-id: 5c59869911785b0181b1663bbf430bc8b7fb2909
2021-06-22 03:45:27 -07:00
Nikhila Ravi
c3d636dc8c
Cuda updates
Summary:
Updates to:
- enable CUDA kernel launches on any GPU (not just the default)
- CUDA and contiguity checks for all kernels
- checks to ensure all tensors are on the same device
- error reporting in the CUDA kernels
- CUDA tests now run on a random device, not just the default
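The same-device check from the list above can be sketched in pure Python. This is a hedged illustration only: `FakeTensor` and `check_all_same_device` are hypothetical names, not the actual PyTorch3D C++ check macros.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    # stand-in for a real tensor; only the device attribute matters here
    device: str


def check_all_same_device(*tensors: FakeTensor) -> None:
    """Raise if the given tensors do not all live on the same device."""
    devices = {t.device for t in tensors}
    if len(devices) > 1:
        raise ValueError(f"expected all tensors on one device, got: {sorted(devices)}")


# tensors on the same device pass silently; mixed devices raise
check_all_same_device(FakeTensor("cuda:1"), FakeTensor("cuda:1"))
```

In the real kernels the analogous check runs before launch, so a device mismatch surfaces as a clear error instead of a silent wrong result.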
Reviewed By: jcjohnson, gkioxari
Differential Revision: D21215280
fbshipit-source-id: 1bedc9fe6c35e9e920bdc4d78ed12865b1005519
2020-04-24 09:11:04 -07:00
Patrick Labatut
d57daa6f85
Address black + isort fbsource linter warnings
Summary: Address black + isort fbsource linter warnings from D20558374 (previous diff)
Reviewed By: nikhilaravi
Differential Revision: D20558373
fbshipit-source-id: d3607de4a01fb24c0d5269634563a7914bddf1c8
2020-03-29 14:51:02 -07:00
Patrick Labatut
3c71ab64cc
Remove shebang line when not strictly required
Summary: The shebang line `#!<path to interpreter>` is only required for Python scripts, so remove it from source files that contain only class or function definitions. Additionally, explicitly mark the actual Python scripts in the codebase as executable.
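As a rough illustration of the cleanup described above, stripping a leading shebang from non-script module source can be done with a small helper. The helper name is hypothetical and is not part of the codebase:

```python
def strip_shebang(source: str) -> str:
    """Drop a leading '#!<interpreter>' line, if present, from module source."""
    lines = source.splitlines(keepends=True)
    if lines and lines[0].startswith("#!"):
        return "".join(lines[1:])
    return source


module_src = "#!/usr/bin/env python\ndef f():\n    return 1\n"
print(strip_shebang(module_src).splitlines()[0])  # → def f():
```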
Reviewed By: nikhilaravi
Differential Revision: D20095778
fbshipit-source-id: d312599fba485e978a243292f88a180d71e1b55a
2020-03-12 10:39:44 -07:00
Georgia Gkioxari
60f3c4e7d2
cpp support for packed to padded
...
Summary:
CPU implementation of packed_to_padded, with gradients added.
```
Benchmark Avg Time(μs) Peak Time(μs) Iterations
--------------------------------------------------------------------------------
PACKED_TO_PADDED_2_100_300_1_cpu 138 221 3625
PACKED_TO_PADDED_2_100_300_1_cuda:0 184 261 2716
PACKED_TO_PADDED_2_100_300_16_cpu 555 726 901
PACKED_TO_PADDED_2_100_300_16_cuda:0 179 260 2794
PACKED_TO_PADDED_2_100_3000_1_cpu 396 519 1262
PACKED_TO_PADDED_2_100_3000_1_cuda:0 181 274 2764
PACKED_TO_PADDED_2_100_3000_16_cpu 4517 5003 111
PACKED_TO_PADDED_2_100_3000_16_cuda:0 224 397 2235
PACKED_TO_PADDED_2_1000_300_1_cpu 138 212 3616
PACKED_TO_PADDED_2_1000_300_1_cuda:0 180 282 2775
PACKED_TO_PADDED_2_1000_300_16_cpu 565 711 885
PACKED_TO_PADDED_2_1000_300_16_cuda:0 179 264 2797
PACKED_TO_PADDED_2_1000_3000_1_cpu 389 494 1287
PACKED_TO_PADDED_2_1000_3000_1_cuda:0 180 271 2777
PACKED_TO_PADDED_2_1000_3000_16_cpu 4522 5170 111
PACKED_TO_PADDED_2_1000_3000_16_cuda:0 216 286 2313
PACKED_TO_PADDED_10_100_300_1_cpu 251 345 1995
PACKED_TO_PADDED_10_100_300_1_cuda:0 178 262 2806
PACKED_TO_PADDED_10_100_300_16_cpu 2354 2750 213
PACKED_TO_PADDED_10_100_300_16_cuda:0 178 291 2814
PACKED_TO_PADDED_10_100_3000_1_cpu 1519 1786 330
PACKED_TO_PADDED_10_100_3000_1_cuda:0 179 237 2791
PACKED_TO_PADDED_10_100_3000_16_cpu 24705 25879 21
PACKED_TO_PADDED_10_100_3000_16_cuda:0 228 316 2191
PACKED_TO_PADDED_10_1000_300_1_cpu 261 432 1919
PACKED_TO_PADDED_10_1000_300_1_cuda:0 181 261 2756
PACKED_TO_PADDED_10_1000_300_16_cpu 2349 2770 213
PACKED_TO_PADDED_10_1000_300_16_cuda:0 180 256 2782
PACKED_TO_PADDED_10_1000_3000_1_cpu 1613 1929 310
PACKED_TO_PADDED_10_1000_3000_1_cuda:0 183 253 2739
PACKED_TO_PADDED_10_1000_3000_16_cpu 22041 23653 23
PACKED_TO_PADDED_10_1000_3000_16_cuda:0 220 343 2270
PACKED_TO_PADDED_32_100_300_1_cpu 555 750 901
PACKED_TO_PADDED_32_100_300_1_cuda:0 188 282 2661
PACKED_TO_PADDED_32_100_300_16_cpu 7550 8131 67
PACKED_TO_PADDED_32_100_300_16_cuda:0 181 272 2770
PACKED_TO_PADDED_32_100_3000_1_cpu 4574 6327 110
PACKED_TO_PADDED_32_100_3000_1_cuda:0 173 254 2884
PACKED_TO_PADDED_32_100_3000_16_cpu 70366 72563 8
PACKED_TO_PADDED_32_100_3000_16_cuda:0 349 654 1433
PACKED_TO_PADDED_32_1000_300_1_cpu 612 728 818
PACKED_TO_PADDED_32_1000_300_1_cuda:0 189 295 2647
PACKED_TO_PADDED_32_1000_300_16_cpu 7699 8254 65
PACKED_TO_PADDED_32_1000_300_16_cuda:0 189 311 2646
PACKED_TO_PADDED_32_1000_3000_1_cpu 5105 5261 98
PACKED_TO_PADDED_32_1000_3000_1_cuda:0 191 260 2625
PACKED_TO_PADDED_32_1000_3000_16_cpu 87073 92708 6
PACKED_TO_PADDED_32_1000_3000_16_cuda:0 344 425 1455
--------------------------------------------------------------------------------
Benchmark Avg Time(μs) Peak Time(μs) Iterations
--------------------------------------------------------------------------------
PACKED_TO_PADDED_TORCH_2_100_300_1_cpu 492 627 1016
PACKED_TO_PADDED_TORCH_2_100_300_1_cuda:0 768 975 652
PACKED_TO_PADDED_TORCH_2_100_300_16_cpu 659 804 760
PACKED_TO_PADDED_TORCH_2_100_300_16_cuda:0 781 918 641
PACKED_TO_PADDED_TORCH_2_100_3000_1_cpu 624 734 802
PACKED_TO_PADDED_TORCH_2_100_3000_1_cuda:0 778 929 643
PACKED_TO_PADDED_TORCH_2_100_3000_16_cpu 2609 2850 192
PACKED_TO_PADDED_TORCH_2_100_3000_16_cuda:0 758 901 660
PACKED_TO_PADDED_TORCH_2_1000_300_1_cpu 467 612 1072
PACKED_TO_PADDED_TORCH_2_1000_300_1_cuda:0 772 905 648
PACKED_TO_PADDED_TORCH_2_1000_300_16_cpu 689 839 726
PACKED_TO_PADDED_TORCH_2_1000_300_16_cuda:0 789 1143 635
PACKED_TO_PADDED_TORCH_2_1000_3000_1_cpu 629 735 795
PACKED_TO_PADDED_TORCH_2_1000_3000_1_cuda:0 812 916 616
PACKED_TO_PADDED_TORCH_2_1000_3000_16_cpu 2716 3117 185
PACKED_TO_PADDED_TORCH_2_1000_3000_16_cuda:0 844 1288 593
PACKED_TO_PADDED_TORCH_10_100_300_1_cpu 2387 2557 210
PACKED_TO_PADDED_TORCH_10_100_300_1_cuda:0 4112 4993 122
PACKED_TO_PADDED_TORCH_10_100_300_16_cpu 3385 4254 148
PACKED_TO_PADDED_TORCH_10_100_300_16_cuda:0 3959 4902 127
PACKED_TO_PADDED_TORCH_10_100_3000_1_cpu 2918 3105 172
PACKED_TO_PADDED_TORCH_10_100_3000_1_cuda:0 4054 4450 124
PACKED_TO_PADDED_TORCH_10_100_3000_16_cpu 12748 13623 40
PACKED_TO_PADDED_TORCH_10_100_3000_16_cuda:0 4023 4395 125
PACKED_TO_PADDED_TORCH_10_1000_300_1_cpu 2258 2492 222
PACKED_TO_PADDED_TORCH_10_1000_300_1_cuda:0 3997 4312 126
PACKED_TO_PADDED_TORCH_10_1000_300_16_cpu 3404 3597 147
PACKED_TO_PADDED_TORCH_10_1000_300_16_cuda:0 3877 4227 129
PACKED_TO_PADDED_TORCH_10_1000_3000_1_cpu 2789 3054 180
PACKED_TO_PADDED_TORCH_10_1000_3000_1_cuda:0 3821 4402 131
PACKED_TO_PADDED_TORCH_10_1000_3000_16_cpu 11967 12963 42
PACKED_TO_PADDED_TORCH_10_1000_3000_16_cuda:0 3729 4290 135
PACKED_TO_PADDED_TORCH_32_100_300_1_cpu 6933 8152 73
PACKED_TO_PADDED_TORCH_32_100_300_1_cuda:0 11856 12287 43
PACKED_TO_PADDED_TORCH_32_100_300_16_cpu 9895 11205 51
PACKED_TO_PADDED_TORCH_32_100_300_16_cuda:0 12354 13596 41
PACKED_TO_PADDED_TORCH_32_100_3000_1_cpu 9516 10128 53
PACKED_TO_PADDED_TORCH_32_100_3000_1_cuda:0 12917 13597 39
PACKED_TO_PADDED_TORCH_32_100_3000_16_cpu 41209 43783 13
PACKED_TO_PADDED_TORCH_32_100_3000_16_cuda:0 12210 13288 41
PACKED_TO_PADDED_TORCH_32_1000_300_1_cpu 7179 7689 70
PACKED_TO_PADDED_TORCH_32_1000_300_1_cuda:0 11896 12381 43
PACKED_TO_PADDED_TORCH_32_1000_300_16_cpu 10127 15494 50
PACKED_TO_PADDED_TORCH_32_1000_300_16_cuda:0 12034 12817 42
PACKED_TO_PADDED_TORCH_32_1000_3000_1_cpu 8743 10251 58
PACKED_TO_PADDED_TORCH_32_1000_3000_1_cuda:0 12023 12908 42
PACKED_TO_PADDED_TORCH_32_1000_3000_16_cpu 39071 41777 13
PACKED_TO_PADDED_TORCH_32_1000_3000_16_cuda:0 11999 13690 42
--------------------------------------------------------------------------------
```
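For reference, the packed-to-padded operation being benchmarked can be sketched in pure Python. This is a naive, hedged version operating on flat lists with per-element first indices; the actual implementation works on flat tensors in C++/CUDA and also provides gradients:

```python
def packed_to_padded(packed, first_idxs, max_size, pad_value=0.0):
    """Convert a packed (flat) list of values into a padded batch.

    packed:     flat list of all values across the batch
    first_idxs: start index of each batch element within `packed`
    max_size:   length every batch element is padded to
    """
    batch = []
    for b, start in enumerate(first_idxs):
        # the element ends where the next one starts (or at the end of `packed`)
        end = first_idxs[b + 1] if b + 1 < len(first_idxs) else len(packed)
        row = list(packed[start:end])
        row += [pad_value] * (max_size - len(row))
        batch.append(row)
    return batch


print(packed_to_padded([1.0, 2.0, 3.0], [0, 2], 3))
# → [[1.0, 2.0, 0.0], [3.0, 0.0, 0.0]]
```

The benchmark tables compare the custom kernel (`PACKED_TO_PADDED_*`) against a pure-PyTorch baseline (`PACKED_TO_PADDED_TORCH_*`); the kernel's CUDA times stay near-constant as the padded size grows, while the baseline scales with batch size.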
Reviewed By: bottler, nikhilaravi, jcjohnson
Differential Revision: D19870575
fbshipit-source-id: 23a2477b73373c411899633386c87ab034c3702a
2020-02-19 10:48:54 -08:00