Which NCCL_DEBUG setting is needed so that messages tagged NCCL_GRAPH are printed to the console or to a specified file?
For example, this statement:
INFO(NCCL_GRAPH, "init.cc 1 Ring 2-1 LM test %02d : %d -> %d -> %d", c, comm->channels[c].ring.prev, comm->rank, comm->channels[c].ring.next); prints nothing when I run "mpirun -np 4 -x NCCL_ALGO=ring -x NCCL_DEBUG=TRACE all_reduce_perf -b 8G -e 8G -f 2 -g 1 2>&1 |tee output_file.txt"
Hi lmhahatest, in general, to see any INFO log you need to set "NCCL_DEBUG=INFO" or higher (i.e., ABORT or TRACE will also work), and also set "NCCL_DEBUG_SUBSYS=GRAPH" to get the messages tagged NCCL_GRAPH. If you do not set NCCL_DEBUG_SUBSYS explicitly, you get the default mask, which in v2.23 is INIT, BOOTSTRAP, and ENV, as per the default for ncclDebugMask in src/debug.cc.
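As a rough sketch of the above, the launch line from the question could be adjusted as follows. This assumes the same Open MPI style `-x` flags used in the question; the variable names `NCCL_LOG_ENV` and `CMD` are made up for illustration, and the command is only printed here so it can be checked before running it.

```shell
# Hypothetical helper variables (not part of NCCL or mpirun).
# NCCL_DEBUG=INFO enables INFO-level logs; NCCL_DEBUG_SUBSYS=GRAPH selects
# the subsystem that INFO(NCCL_GRAPH, ...) messages belong to.
NCCL_LOG_ENV="-x NCCL_ALGO=ring -x NCCL_DEBUG=INFO -x NCCL_DEBUG_SUBSYS=GRAPH"

# Same benchmark invocation as in the question, with the extra variable.
CMD="mpirun -np 4 $NCCL_LOG_ENV all_reduce_perf -b 8G -e 8G -f 2 -g 1"

# Print the command for inspection instead of executing it.
echo "$CMD"

# To actually run it and capture the console output, as in the question:
#   eval "$CMD" 2>&1 | tee output_file.txt
```

To send the logs to a file directly rather than through `tee`, NCCL also reads the NCCL_DEBUG_FILE environment variable, which could be passed with an additional `-x NCCL_DEBUG_FILE=...` in the same way.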