Kimsehun725's group workspace

2022-10-14 17:52:55
1.7 M     Total params
2022-10-14 17:52:55
6.873     Total estimated model params size (MB)
2022-10-14 17:52:55
Sanity Checking: 0it [00:00, ?it/s]
2022-10-14 17:52:59
GPU available: True (cuda), used: True
2022-10-14 17:52:59
TPU available: False, using: 0 TPU cores
2022-10-14 17:52:59
IPU available: False, using: 0 IPUs
2022-10-14 17:52:59
HPU available: False, using: 0 HPUs
2022-10-14 17:52:59
wandb: logging graph, to disable use `wandb.watch(log_graph=False)`
2022-10-14 17:52:59
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]
2022-10-14 17:52:59
  | Name                   | Type             | Params
2022-10-14 17:52:59
------------------------------------------------------------
2022-10-14 17:52:59
0 | conv_stack             | ConvStack        | 1.6 M
2022-10-14 17:52:59
1 | self_attention_block   | ConformerEncoder | 92.0 K
2022-10-14 17:52:59
2 | frame_tab_output_layer | Sequential       | 24.6 K
2022-10-14 17:52:59
3 | softmax_by_string      | Softmax          | 0
2022-10-14 17:52:59
------------------------------------------------------------
2022-10-14 17:52:59
1.7 M     Trainable params
2022-10-14 17:52:59
0         Non-trainable params
2022-10-14 17:52:59
1.7 M     Total params
2022-10-14 17:52:59
6.873     Total estimated model params size (MB)
2022-10-14 17:52:59
Training test number 02 ...
2022-10-14 17:52:59
Sanity Checking: 0it [00:00, ?it/s]
2022-10-14 17:53:01
Process Process-2854:
2022-10-14 17:53:01
Traceback (most recent call last):
2022-10-14 17:53:01
  File "/usr/lib/python3.8/multiprocessing/process.py", line 307, in _bootstrap
2022-10-14 17:53:01
    util._finalizer_registry.clear()
2022-10-14 17:53:01
KeyboardInterrupt
2022-10-14 17:53:01
Exception ignored in: <function _releaseLock at 0x2b8605c22280>
2022-10-14 17:53:01
Traceback (most recent call last):
2022-10-14 17:53:01
  File "/usr/lib/python3.8/logging/__init__.py", line 227, in _releaseLock
2022-10-14 17:53:01
    def _releaseLock():
2022-10-14 17:53:01
KeyboardInterrupt:
2022-10-14 17:53:07
Traceback (most recent call last):
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1163, in _try_get_data
2022-10-14 17:53:07
    data = self._data_queue.get(timeout=timeout)
2022-10-14 17:53:07
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 108, in get
2022-10-14 17:53:07
    raise Empty
2022-10-14 17:53:07
_queue.Empty
2022-10-14 17:53:07
The above exception was the direct cause of the following exception:
2022-10-14 17:53:07
Traceback (most recent call last):
2022-10-14 17:53:07
  File "src/train.py", line 65, in <module>
2022-10-14 17:53:07
    main(kwargs)
2022-10-14 17:53:07
  File "src/train.py", line 25, in main
2022-10-14 17:53:07
    step3(now, kwargs)
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/src/step3.py", line 213, in step3
2022-10-14 17:53:07
    train(kwargs, use_pretrained_model, pretrained_time, pretrained_epoch, now, test_num, train_data_list,
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/src/step3.py", line 153, in train
2022-10-14 17:53:07
    trainer.fit(model, train_dataloaders=train_loader,
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
2022-10-14 17:53:07
    self._call_and_handle_interrupt(
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
2022-10-14 17:53:07
    return trainer_fn(*args, **kwargs)
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
2022-10-14 17:53:07
    results = self._run(model, ckpt_path=self.ckpt_path)
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
2022-10-14 17:53:07
    results = self._run_stage()
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1252, in _run_stage
2022-10-14 17:53:07
    return self._run_train()
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1274, in _run_train
2022-10-14 17:53:07
    self._run_sanity_check()
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1343, in _run_sanity_check
2022-10-14 17:53:07
    val_loop.run()
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
2022-10-14 17:53:07
    self.advance(*args, **kwargs)
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
2022-10-14 17:53:07
    dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
2022-10-14 17:53:07
    self.advance(*args, **kwargs)
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 127, in advance
2022-10-14 17:53:07
    batch = next(data_fetcher)
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 184, in __next__
2022-10-14 17:53:07
    return self.fetching_function()
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 263, in fetching_function
2022-10-14 17:53:07
    self._fetch_next_batch(self.dataloader_iter)
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 277, in _fetch_next_batch
2022-10-14 17:53:07
    batch = next(iterator)
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
2022-10-14 17:53:07
    data = self._next_data()
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1359, in _next_data
2022-10-14 17:53:07
    idx, data = self._get_data()
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1325, in _get_data
2022-10-14 17:53:07
    success, data = self._try_get_data()
2022-10-14 17:53:07
  File "/data/group1/z44543r/vae_separation/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1176, in _try_get_data
2022-10-14 17:53:07
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
2022-10-14 17:53:07
RuntimeError: DataLoader worker (pid(s) 124026, 124066, 124106, 124146, 124186, 124226, 124266, 124306, 124346, 124386, 124426, 124466, 124506, 124546, 124586, 124626, 124666, 124706, 124746, 124786, 124826, 124866, 124906, 124946, 124986, 125026, 125066, 125106, 125146, 125186, 125226, 125266, 125306, 125346, 125386, 125426) exited unexpectedly