janders1 asked

RL not enough values to unpack (expected 5, got 4)

While doing the RL tutorial here, I am getting the following error. I am guessing my step function is returning "observation, reward, done, info", but stable-baselines3 is trying to unpack a fifth value called "truncated".


Waiting for input to close FlexSim...
& "C:/Program Files/Python311/python.exe" "//fs-caedm.et.byu.edu/homes/.caedm/My Documents/FlexSim 2024 Projects/RL Tutorial/flexsim_training.py"
Traceback (most recent call last):
  File "\\fs-caedm.et.byu.edu\homes\.caedm\My Documents\FlexSim 2024 Projects\RL Tutorial\flexsim_training.py", line 60, in <module>        
    main()
  File "\\fs-caedm.et.byu.edu\homes\.caedm\My Documents\FlexSim 2024 Projects\RL Tutorial\flexsim_training.py", line 29, in main
    model.learn(total_timesteps=10)
  File "C:\Program Files\Python311\Lib\site-packages\stable_baselines3\ppo\ppo.py", line 315, in learn
    return super().learn(
           ^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 277, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 194, in collect_rollouts        
    new_obs, rewards, dones, infos = env.step(clipped_actions)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\stable_baselines3\common\vec_env\base_vec_env.py", line 206, in step
    return self.step_wait()
           ^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py", line 58, in step_wait
    obs, self.buf_rews[env_idx], terminated, truncated, self.buf_infos[env_idx] = self.envs[env_idx].step(
                                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\site-packages\stable_baselines3\common\monitor.py", line 94, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
ValueError: not enough values to unpack (expected 5, got 4)
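
To illustrate what I think is happening: stable-baselines3 follows the Gymnasium step API and unpacks five values from env.step(), so a Gym-style four-value return fails at that assignment. A minimal reproduction of just the unpack (no FlexSim or SB3 involved, values are placeholders):

# SB3's monitor.py unpacks five names from env.step(); a 4-tuple raises the same error
step_result = ("obs", 0.0, False, {})  # old Gym return: observation, reward, done, info
observation, reward, terminated, truncated, info = step_result
# ValueError: not enough values to unpack (expected 5, got 4)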
FlexSim 24.1.0 · reinforcement training


1 Answer

janders1 answered

Fixed it. In flexsim_env.py, the last line of the step function does not include "truncated". Add "truncated" before "info" in that return statement, and it works.
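
For reference, here is a minimal sketch of the five-value step() return that stable-baselines3 expects. This is a placeholder environment, not the tutorial's actual FlexSimEnv; the names, spaces, and step limit are assumptions made just to show the signature:

import gymnasium as gym
import numpy as np
from gymnasium import spaces

class TinyEnv(gym.Env):
    """Placeholder env showing the Gymnasium-style step() signature."""

    def __init__(self):
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)
        self._steps = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._steps = 0
        return np.zeros(1, dtype=np.float32), {}  # (observation, info)

    def step(self, action):
        self._steps += 1
        observation = np.zeros(1, dtype=np.float32)
        reward = 0.0
        terminated = False             # natural end of the episode
        truncated = self._steps >= 10  # episode cut short by a step limit
        info = {}
        # Five values, matching what SB3's Monitor/DummyVecEnv unpack:
        return observation, reward, terminated, truncated, info

With that return signature, model.learn() should get past the unpack, as long as reset() also returns the two-value (observation, info) form.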
