PyTorch: Spawning subprocesses

Spawning subprocesses

Available for Python >= 3.4 only.

This depends on the spawn start method in Python's multiprocessing package.

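You can spawn any number of subprocesses to run a function by creating Process instances and calling join to wait for them to finish. This works well with a single subprocess, but it can cause problems with several.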

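Specifically, joining processes sequentially assumes that they terminate sequentially. If they do not, and the first process has not yet terminated, the termination of the other processes goes unnoticed. In addition, there is no native facility for error propagation.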
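The manual pattern described above can be sketched with the standard multiprocessing module alone. The worker work, the helper first_failure, and the sleep durations are illustrative assumptions, not part of PyTorch. Process 1 fails almost at once, but the parent only finds out after the join on process 0 returns.

```python
import multiprocessing as mp
import sys
import time


def work(index, seconds, exit_code):
    """Sleep, then exit with the given status (non-zero simulates a failure)."""
    time.sleep(seconds)
    sys.exit(exit_code)


def first_failure(exit_codes):
    """Return the index of the first non-zero exit code, or None if all are 0."""
    for index, code in enumerate(exit_codes):
        if code != 0:
            return index
    return None


def main():
    ctx = mp.get_context("spawn")
    # Process 0 runs for 1 s and succeeds; process 1 fails after 0.1 s.
    plans = [(1.0, 0), (0.1, 1)]
    procs = [
        ctx.Process(target=work, args=(i, seconds, code))
        for i, (seconds, code) in enumerate(plans)
    ]
    for p in procs:
        p.start()
    start = time.monotonic()
    for p in procs:
        # Joins happen in list order, so the parent blocks on process 0
        # even though process 1 has already failed.
        p.join()
    elapsed = time.monotonic() - start
    failed = first_failure([p.exitcode for p in procs])
    print(f"first failed process: {failed}, noticed after {elapsed:.1f}s")


if __name__ == "__main__":
    main()
```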

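The spawn function below addresses these problems. It handles error propagation and out-of-order termination, and it actively terminates the remaining processes as soon as it detects an error in one of them.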
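A rough sketch of that behaviour, using only the standard library: wait on all processes at once, notice whichever finishes first, and terminate the rest as soon as one fails. The names join_any and work are hypothetical, and this is a simplified illustration rather than PyTorch's implementation.

```python
import multiprocessing as mp
import sys
import time
from multiprocessing.connection import wait


def work(index, seconds, exit_code):
    """Sleep, then exit with the given status (non-zero simulates a failure)."""
    time.sleep(seconds)
    sys.exit(exit_code)


def join_any(procs):
    """Join started processes in completion order; stop at the first failure.

    Returns the index of the first process seen with a non-zero exit code
    (after terminating the others), or None if all exited with status 0.
    """
    pending = {p.sentinel: (i, p) for i, p in enumerate(procs)}
    while pending:
        # wait() returns the sentinels of whichever processes have exited.
        for sentinel in wait(list(pending)):
            index, proc = pending.pop(sentinel)
            proc.join()
            if proc.exitcode != 0:
                for _, other in pending.values():
                    other.terminate()
                    other.join()
                return index
    return None


def main():
    ctx = mp.get_context("spawn")
    # Process 0 would run for 5 s; process 1 fails after 0.1 s.
    plans = [(5.0, 0), (0.1, 1)]
    procs = [
        ctx.Process(target=work, args=(i, seconds, code))
        for i, (seconds, code) in enumerate(plans)
    ]
    for p in procs:
        p.start()
    start = time.monotonic()
    failed = join_any(procs)
    elapsed = time.monotonic() - start
    print(f"failed process: {failed}, detected after {elapsed:.1f}s")


if __name__ == "__main__":
    main()
```

Unlike a sequential join, the failure is detected without waiting for the long-running process to finish.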

torch.multiprocessing.spawn(fn, args=(), nprocs=1, join=True, daemon=False)

Spawns nprocs processes that run fn with args.

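If one of the processes exits with a non-zero exit status, the remaining processes are killed and an exception is raised with the cause of termination. If an exception was caught in the child process, it is forwarded and its traceback is included in the exception raised in the parent process.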
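One way to forward a child's exception, sketched with the standard library: the child-side wrapper catches the exception, puts the formatted traceback on a queue, and exits with a non-zero status; the parent reads the traceback and embeds it in the exception it raises. The names wrap, raise_child_error, ChildFailed, and failing_worker are hypothetical. The demo runs in a single process with queue.Queue; a real multi-process version would need a multiprocessing queue instead.

```python
import queue
import sys
import traceback


class ChildFailed(Exception):
    """Raised on the parent side; its message carries the child's traceback."""


def wrap(fn, index, args, error_queue):
    """Child side: run fn(index, *args); on failure record the traceback and exit."""
    try:
        fn(index, *args)
    except Exception:
        error_queue.put(traceback.format_exc())
        sys.exit(1)


def raise_child_error(index, error_queue):
    """Parent side: re-raise with the child's traceback included in the message."""
    child_traceback = error_queue.get()
    raise ChildFailed(
        f"process {index} terminated with the following error:\n{child_traceback}"
    )


def failing_worker(index, message):
    """Example entry point that always fails."""
    raise ValueError(message)


if __name__ == "__main__":
    errors = queue.Queue()
    try:
        wrap(failing_worker, 0, ("bad input",), errors)
    except SystemExit:
        pass  # in a real child this would end the process with status 1
    try:
        raise_child_error(0, errors)
    except ChildFailed as exc:
        print(exc)
```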

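Parameters:

fn (function): the entry point of each spawned process. It must be defined at the top level of a module so that it can be pickled, and it is called as fn(i, *args), where i is the process index and args is the tuple passed through.
args (tuple): arguments passed to fn.
nprocs (int): number of processes to spawn.
join (bool): whether to perform a blocking join on all processes.
daemon (bool): the daemon flag of the spawned processes. If True, daemonic processes are created.

Returns None if join is True, or a SpawnContext if join is False.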

class torch.multiprocessing.SpawnContext

Returned by spawn() when it is called with join=False.

join(timeout=None)

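Tries to join one or more processes in this spawn context. If one of them exited with a non-zero exit status, this function kills the remaining processes and raises an exception with the cause of the first process exiting.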

Returns True if all processes have been joined successfully, and False if there are more processes that still need to be joined.
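With join=False the caller drives the joining. A common pattern is to call join in a loop until it returns True, doing other work between polls. The sketch below is written against any object whose join(timeout) behaves as described above. FakeContext is a stand-in invented here so the loop can run without spawning processes, and wait_for_all is a hypothetical helper name.

```python
class FakeContext:
    """Stand-in for a spawn context: join() returns False a fixed number of times."""

    def __init__(self, pending_polls):
        self.pending_polls = pending_polls

    def join(self, timeout=None):
        if self.pending_polls > 0:
            self.pending_polls -= 1
            return False  # some processes still need to be joined
        return True  # everything has been joined


def wait_for_all(context, poll_seconds=1.0, on_poll=None):
    """Call context.join() until it returns True; return the number of extra polls.

    Any exception raised by join() (for example, a failed child) propagates
    to the caller unchanged.
    """
    polls = 0
    while not context.join(timeout=poll_seconds):
        polls += 1
        if on_poll is not None:
            on_poll(polls)  # hook for progress reporting or other work
    return polls


if __name__ == "__main__":
    polls = wait_for_all(FakeContext(3), on_poll=lambda n: print(f"still running ({n})"))
    print(f"all processes joined after {polls} extra polls")
```

With PyTorch, context would be the value returned by torch.multiprocessing.spawn(fn, args=..., nprocs=..., join=False).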

Usage example:

Reference (the launcher code, followed by the utils/multiprocessing.py helper module it imports):

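import torch

import utils.multiprocessing as mpu

# cfg (the config node) and train (the function to run in each process)
# are defined elsewhere in the project.
if cfg.num_gpus > 1:
    # spawn prepends the process index, which mpu.run receives as local_rank.
    torch.multiprocessing.spawn(
        mpu.run,
        nprocs=cfg.num_gpus,
        args=(
            cfg.num_gpus,
            train,
            cfg.dist_init_method,
            cfg.shard_id,
            cfg.num_shards,
            cfg.dist_backend,
            cfg,
        ),
        daemon=False,
    )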
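spawn calls the entry point as fn(i, *args), where i is the process index, so mpu.run receives one more positional argument than the args tuple contains. The sketch below uses inspect.signature to show where each value lands. The run stub only copies the parameter list, bind_like_spawn is a hypothetical helper, and the concrete values (8 processes, the tcp:// address, the string standing in for train) are made-up placeholders.

```python
import inspect


def run(local_rank, num_proc, func, init_method, shard_id, num_shards, backend, cfg):
    """Stub with the same parameter list as mpu.run; the body is omitted."""


def bind_like_spawn(fn, index, args):
    """Map each parameter name of fn to its value for the call fn(index, *args)."""
    bound = inspect.signature(fn).bind(index, *args)
    return dict(bound.arguments)


if __name__ == "__main__":
    # Placeholder values, in the same order as the args tuple in the launcher.
    args = (8, "train", "tcp://localhost:9999", 0, 1, "nccl", {})
    for name, value in bind_like_spawn(run, 3, args).items():
        print(f"{name} = {value!r}")
```

The process index fills local_rank, and the seven elements of args fill the remaining parameters in order.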

"""multiprocessing helpers."""

import torch


def run(
    local_rank, num_proc, func, init_method, shard_id, num_shards, backend, cfg
):
    """
    Runs a function from a child process.

    Args:
        local_rank (int): rank of the current process on the current machine.
        num_proc (int): number of processes per machine.
        func (function): function to execute in each process.
        init_method (string): method used to initialize distributed training.
            TCP initialization: requires a network address reachable from all
            processes, followed by the port.
            Shared file-system initialization: makes use of a file system that
            is shared and visible from all machines. The URL should start with
            file:// and contain a path to a non-existent file on a shared file
            system.
        shard_id (int): the rank of the current machine.
        num_shards (int): total number of machines in the distributed
            training job.
        backend (string): three distributed backends ('nccl', 'gloo', 'mpi')
            are supported, each with different capabilities. Details are in
            the torch.distributed documentation.
        cfg (CfgNode): configs. Details can be found in
            slowfast/config/defaults.py
    """
    # Initialize the process group.
    world_size = num_proc * num_shards
    rank = shard_id * num_proc + local_rank

    try:
        torch.distributed.init_process_group(
            backend=backend,
            init_method=init_method,
            world_size=world_size,
            rank=rank,
        )
    except Exception as e:
        raise e

    # Bind this process to the GPU that matches its local rank.
    torch.cuda.set_device(local_rank)
    func(cfg)
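The rank arithmetic in run gives every process a unique global rank, numbered machine by machine. A small check, assuming 2 machines with 4 processes each (numbers chosen only for illustration; world_size and global_rank are helper names introduced here):

```python
def world_size(num_proc, num_shards):
    """Total number of processes across all machines."""
    return num_proc * num_shards


def global_rank(shard_id, num_proc, local_rank):
    """Global rank of a process, given its machine and its rank on that machine."""
    return shard_id * num_proc + local_rank


if __name__ == "__main__":
    num_proc, num_shards = 4, 2
    for shard_id in range(num_shards):
        ranks = [global_rank(shard_id, num_proc, r) for r in range(num_proc)]
        print(f"machine {shard_id}: global ranks {ranks}")
    print(f"world size: {world_size(num_proc, num_shards)}")
```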
