feat(cli): Add support for `delay_login_until_ready` by mafredri · Pull Request #5851 · coder/coder

mafredri · 2023-01-25T14:16:17Z

This PR adds support to ssh and coder ssh for delay_login_until_ready.

Here are examples of when coder_agent.delay_login_until_ready = true (for false, behavior is as before, sans formatting changes):

❯ coder ssh test

 > The workspace agent is taking longer than expected to
   start. See troubleshooting instructions at:
   https://coder.com/docs/coder-oss/latest/templates#troubleshooting-templates

⢎⡱ Waiting for  main  to finish starting up...

When startup takes 30+ seconds and the state goes from start_timeout to start_error:

❯ coder ssh test

 > Don't panic, your workspace is starting up!

 > The workspace agent is taking longer than expected to
   start. See troubleshooting instructions at:
   https://coder.com/docs/coder-oss/latest/templates#troubleshooting-templates

 > The workspace agent ran into a problem during startup.
   See troubleshooting instructions at:
   https://coder.com/docs/coder-oss/latest/templates#troubleshooting-templates

Agent startup script exited with non-zero status, use --no-wait to login anyway.
Run 'coder ssh --help' for usage.

~~Ultimately in this case, the user can log in with coder ssh --skip-delay-login-until-ready.~~

~~We might want to change the behavior of start_error to exit with non-zero status instead of blocking.~~

Ref: #5749

mafredri · 2023-01-26T09:56:12Z

 	cliflag.BoolVarP(cmd.Flags(), &forwardGPG, "forward-gpg", "G", "CODER_SSH_FORWARD_GPG", false, "Specifies whether to forward the GPG agent. Unsupported on Windows workspaces, but supports all clients. Requires gnupg (gpg, gpgconf) on both the client and workspace. The GPG agent must already be running locally and will not be started for you. If a GPG agent is already running in the workspace, it will be attempted to be killed.")
 	cliflag.StringVarP(cmd.Flags(), &identityAgent, "identity-agent", "", "CODER_SSH_IDENTITY_AGENT", "", "Specifies which identity agent to use (overrides $SSH_AUTH_SOCK), forward agent must also be enabled")
 	cliflag.DurationVarP(cmd.Flags(), &wsPollInterval, "workspace-poll-interval", "", "CODER_WORKSPACE_POLL_INTERVAL", workspacePollInterval, "Specifies how often to poll for workspace automated shutdown.")
+	cliflag.BoolVarP(cmd.Flags(), &skipDelayLoginUntilReady, "skip-delay-login-until-ready", "", "CODER_SSH_SKIP_DELAY_LOGIN_UNTIL_READY", false, "Specifies whether to login to a workspace that has not finished starting up (only applicable when the delay login until ready option is enabled).")


Should we go with something simpler and shorter here? Maybe --no-delay-login or --no-wait. I'd also like to leave room for the opposite flag (--[no-]wait) to allow waiting even when the template doesn't specify it.

--no-wait is succinct 👍
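A minimal sketch of what a boolean `--no-wait` flag with an environment-variable default could look like. It uses only the Go standard library `flag` package rather than the `cliflag` helpers from the diff above, and the names `parseNoWait`, `envBool`, and the variable `CODER_SSH_NO_WAIT` are assumptions made for illustration, not taken from this PR.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strconv"
)

// envBool returns the boolean value of the named environment variable,
// or def when the variable is unset or cannot be parsed.
func envBool(name string, def bool) bool {
	v, ok := os.LookupEnv(name)
	if !ok {
		return def
	}
	b, err := strconv.ParseBool(v)
	if err != nil {
		return def
	}
	return b
}

// parseNoWait parses args and reports whether --no-wait was requested.
// CODER_SSH_NO_WAIT is a hypothetical environment variable name.
func parseNoWait(args []string) (bool, error) {
	fs := flag.NewFlagSet("ssh", flag.ContinueOnError)
	noWait := fs.Bool("no-wait", envBool("CODER_SSH_NO_WAIT", false),
		"Log in even if the workspace has not finished starting up.")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *noWait, nil
}

func main() {
	noWait, err := parseNoWait(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("no-wait:", noWait)
}
```

A later `--wait` counterpart could be added the same way; how the two would interact with the template setting is left open here.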

mafredri · 2023-01-26T09:58:52Z

+					}
+				case codersdk.WorkspaceAgentLifecycleReady:
+					return nil
+				default:


Yay or nay?

Suggested change

-				default:
+				case codersdk.WorkspaceAgentLifecycleStartError:
+					showMessage()
+					return errors.New("...")
+				default:

case case?

I think we should fail-open here. Let folks try to connect, but warn that it might not work.

Yup, I agree. Let's make it so!

Done, this is what it looks like now:

❯ coder ssh test

 > The workspace agent ran into a problem during startup.
   See troubleshooting instructions at:
   https://coder.com/docs/coder-oss/latest/templates#troubleshooting-templates

Agent startup script exited with non-zero status, use --no-wait to login anyway.
Run 'coder ssh --help' for usage.

And with --no-wait we still warn about the problem:

❯ coder ssh test --no-wait

 > The workspace agent ran into a problem during startup.
   See troubleshooting instructions at:
   https://coder.com/docs/coder-oss/latest/templates#troubleshooting-templates

mafredri@test ~
❯
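The behavior shown in the outputs above can be sketched roughly as follows. This is not the code from the PR: the `Lifecycle` type, its string values, and the `decide` function are stand-ins invented for illustration, and only the message texts are copied from the examples in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// Lifecycle is a hypothetical stand-in for the agent lifecycle state.
type Lifecycle string

const (
	LifecycleStarting     Lifecycle = "starting"
	LifecycleStartTimeout Lifecycle = "start_timeout"
	LifecycleStartError   Lifecycle = "start_error"
	LifecycleReady        Lifecycle = "ready"
)

var errStartupFailed = errors.New("agent startup script exited with non-zero status, use --no-wait to login anyway")

// decide reports whether login may proceed now, an optional warning to show,
// and an error when login should be refused. When proceed is false and err
// is nil, the caller keeps waiting.
func decide(state Lifecycle, noWait bool) (proceed bool, warning string, err error) {
	switch state {
	case LifecycleReady:
		return true, "", nil
	case LifecycleStartError:
		if noWait {
			// Still warn, but let the user in.
			return true, "The workspace agent ran into a problem during startup.", nil
		}
		return false, "", errStartupFailed
	default:
		// Starting or start_timeout: the startup script is still running.
		if noWait {
			return true, "Your workspace is still getting ready, it may be in an incomplete state.", nil
		}
		return false, "", nil
	}
}

func main() {
	for _, s := range []Lifecycle{LifecycleStarting, LifecycleStartError, LifecycleReady} {
		ok, warn, err := decide(s, false)
		fmt.Printf("%-12s proceed=%v warning=%q err=%v\n", s, ok, warn, err)
	}
}
```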

johnstcn · 2023-01-26T10:30:28Z

+
+func waitingMessage(agent codersdk.WorkspaceAgent) (m *message) {
+	m = &message{
+		Prompt: "Don't panic, your workspace is booting up!",


Can we add directions on how to bypass this e.g.

To skip this check, use the `--no-wait` argument

(Or whatever we end up calling it)

Not for this message (this state is before agent is connected so we can't connect yet).

We could add it for the other message, but I wonder if it would encourage bypassing the prompt, esp. without understanding what the flag does. You just go Ctrl+C -> --no-wait yay I'm in (and not everything is working). Would it suffice to explain this in the troubleshooting link?

I think this need will also be alleviated by streaming the startup log once we have it available.

Would it suffice to explain this in the troubleshooting link?

That's fine either way.

johnstcn

I think this is definitely a good step in troubleshooting failed workspace builds.

One possible avenue of future work I see is adding WorkspaceAgentDownloading / WorkspaceAgentDownloaded states. We could add a random pre-shared nonce (even the build ID could work here) to the workspace agent downloaded URL and use that to determine if the workspace attempted to download the agent.
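To make the nonce idea a bit more concrete, here is a rough sketch of generating a random nonce and attaching it to a download URL as a query parameter. Everything here is an assumption for illustration: the `nonce` parameter name, the helper names, and the example URL do not come from the Coder codebase.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/url"
)

// newNonce returns 16 random bytes encoded as a 32-character hex string.
func newNonce() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// withNonce adds (or replaces) a "nonce" query parameter on rawURL.
func withNonce(rawURL, nonce string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("nonce", nonce)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// nonceOf extracts the "nonce" query parameter, as a server handling the
// download request might do to mark that the workspace tried to fetch the agent.
func nonceOf(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	return u.Query().Get("nonce")
}

func main() {
	n, err := newNonce()
	if err != nil {
		panic(err)
	}
	// Hypothetical download URL.
	dl, err := withNonce("https://coder.example.com/bin/coder-linux-amd64", n)
	if err != nil {
		panic(err)
	}
	fmt.Println(dl, nonceOf(dl) == n)
}
```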

mafredri · 2023-01-26T13:52:59Z

Made a recording to see this in action:

Refs: #5749, #5851

mafredri · 2023-01-27T11:19:54Z

Updated messaging:

⠈⠱ Waiting for  main  to become ready...


 > Don't panic, your workspace agent has connected and
   the workspace is getting ready!

 > The workspace is taking longer than expected to get
   ready, the agent startup script is still executing.
   See troubleshooting instructions at:
   https://coder.com/docs/coder-oss/latest/templates#troubleshooting-templates

 > The workspace ran into a problem while getting ready,
   the agent startup script exited with non-zero status.
   See troubleshooting instructions at:
   https://coder.com/docs/coder-oss/latest/templates#troubleshooting-templates

Agent startup script exited with non-zero status, use --no-wait to login anyway.
Run 'coder ssh --help' for usage.

❯ coder ssh test --no-wait

 > Your workspace is still getting ready, it may be in an
   incomplete state.

mafredri@test ~
❯

bpmct · 2023-01-27T14:58:20Z

+			m.Prompt = "The workspace is taking longer than expected to get ready, the agent startup script is still executing."
+		case codersdk.WorkspaceAgentLifecycleStartError:
+			m.Spin = ""
+			m.Prompt = "The workspace ran into a problem while getting ready, the agent startup script exited with non-zero status."


I like this copy! Once we start streaming build logs, I imagine we can also link to view the logs + the troubleshooting URL?

Yes, definitely! We can also stream the log in the terminal while the user is waiting, e.g. the 5 latest rows (kept updating). Maybe that'd be enabled/disabled via a flag.

That's even better!

feat(cli): Add support for delay_login_until_ready

7004c5d

Ref: #5749

mafredri self-assigned this Jan 25, 2023

mafredri mentioned this pull request Jan 25, 2023

Change agent startup script behavior from being never-ending to indicating the workspace is ready on end #5749

Closed

13 tasks

mafredri marked this pull request as ready for review January 25, 2023 14:26

mafredri requested review from bpmct and kylecarbs January 25, 2023 14:26

Run go mod tidy

fd4a596

mafredri commented Jan 26, 2023

johnstcn reviewed Jan 26, 2023

Rename flag to --no-wait

b7a69a6

johnstcn approved these changes Jan 26, 2023

mafredri added 2 commits January 26, 2023 11:20

Exit on startup error

4f46181

Show error message when using --no-wait

489ed6a

mafredri mentioned this pull request Jan 26, 2023

docs: Document agent readiness issues (startup script) #5877

Merged

mafredri added a commit that referenced this pull request Jan 26, 2023

docs: Document agent startup (script) issues

39b78d7

Refs: #5749, #5851

mafredri added 2 commits January 27, 2023 11:06

Cleanup and reword messeages

8ded3c8

fixup! Cleanup and reword messeages

cdbf70c

mafredri added 3 commits January 27, 2023 11:27

test: Fix ShowTroubleshootingURLAfterTimeout

00ab3ba

Merge branch 'main' into mafredri/feat-delay-login-until-ready

e9552c2

Improve --no-wait flag description

68d4c4a

bpmct reviewed Jan 27, 2023

bpmct approved these changes Jan 27, 2023

kylecarbs approved these changes Jan 27, 2023

Merge branch 'main' into mafredri/feat-delay-login-until-ready

b441de2

mafredri merged commit a753703 into main Jan 27, 2023

mafredri deleted the mafredri/feat-delay-login-until-ready branch January 27, 2023 17:05

github-actions Bot locked and limited conversation to collaborators Jan 27, 2023
