AIChat Structured Output Request on Initial Failed Request#65569
Conversation
d7962d5 to
a86c6ea
Compare
🖼️ Storybook Visual Comparison ReportA difference was found in our Storybook front-end visual comparison testing against the Remediation steps:
|
…put processing outside of retry loop in safety_helper
91e9a53 to
0afcdc9
Compare
bencodeorg
left a comment
There was a problem hiding this comment.
A few smaller comments, but main question is about using "function calling" vs. specifying a response format like we do in our existing usage of structured outputs.
Per this description, it sounds like our use case would probably be a candidate for using response format?
https://platform.openai.com/docs/guides/structured-outputs#function-calling-vs-response-format
Happy to chat synchronously and/or pair on this as well if helpful!
| content: text | ||
| } | ||
| ], | ||
| tools: [ |
There was a problem hiding this comment.
Readability nit: maybe break out these beefier objects into variable declarations before using them when calling request_structured_chat_completion?
…st case to reflect new structured response fallback
bencodeorg
left a comment
There was a problem hiding this comment.
A few smaller comments, but overall, I pulled this and it worked great! 🎉
One general note is that I do think this could be simplified a bit -- easiest way to do this was for me to play with it a bit, so here's a branch off your branch that drops one of the layers: #65808
We can merge yours as-is and do this as a follow-up, or merge this into your branch (and take it for another spin if you do merge it in, I just tested quickly 😅 ) Thanks for doing this!
|
|
||
| evaluation = JSON.parse(response.body)['choices'][0]['message']['content'] | ||
| unless VALID_EVALUATION_RESPONSES_SIMPLE.include?(evaluation) | ||
| report_openai_safety_check("InvalidResponse") |
There was a problem hiding this comment.
Could we use a different message to differentiate this response (expected sometimes) from the errors below (expected almost never)?
There was a problem hiding this comment.
+1, or alternatively should we also pass through the attempt # we're on so we can use that as a dimension value?
| } | ||
|
|
||
| AichatOpenaiHelper.request_structured_chat_completion( | ||
| [ |
There was a problem hiding this comment.
Is this the same as what's returned by safety_check_messages?
| body = JSON.parse(http_response.body) | ||
| raise StandardError.new(body['error']) if body['error'] | ||
| # Check if the returned content is a json, else throw error | ||
| response_json = body.dig("choices", 0, "message", "content") |
There was a problem hiding this comment.
Is this error checking different than what we're doing in request_chat_completion above?
There was a problem hiding this comment.
+1, could this be consolidated with the above method? and/or common code moved to a helper? Looks like the key differences here are 1) passing in the response format and 2) parsing the JSON response?
| # Retry only on network-related exceptions | ||
| response = Retryable.retryable( | ||
| tries: 2, | ||
| on: [Net::OpenTimeout, Net::ReadTimeout, SocketError, Errno::ECONNRESET] |
There was a problem hiding this comment.
Does this error list come from somewhere (eg, errors we've seen in HB?), or just theorized network exceptions? FWIW still feels a little excessive to me, but not a huge deal. I think adding a more verbose comment on the reasoning though might be helpful for future readers.
|
|
||
| begin | ||
| parsed = JSON.parse(raw_content) | ||
| rescue JSON::ParserError |
There was a problem hiding this comment.
Why don't we just let Ruby throw this error (vs rescuing and raising a custom error)?
sanchitmalhotra126
left a comment
There was a problem hiding this comment.
Nice work, very cool to see this working end to end! I think there are some options for simplification and consolidation to make the code easier to read and follow, and +1 to Ben's suggestions and simplification PR. But overall this is looking good!
| messages = safety_check_messages(text, level_id) | ||
|
|
||
| # Retry only on network-related exceptions | ||
| response = Retryable.retryable( |
There was a problem hiding this comment.
Is it still worth ditching Retryable and letting any uncaught exceptions after the second round bubble up to the caller?
There was a problem hiding this comment.
I definitely thought about this. In some ways, I feel like the retryable clause ends up being cleaner since the worst case flow is 1st call network error -> 2nd attempt invalid response -> 3rd attempt structural and retryable feels like it has a much shorter/simpler syntax to capture what's going on with those first two attempts rather than a tree of if/else.
If we'd prefer being more explicit with the if/else cases, though, I think it totally makes sense to switch to that and it relies less on a library.
|
|
||
| evaluation = JSON.parse(response.body)['choices'][0]['message']['content'] | ||
| unless VALID_EVALUATION_RESPONSES_SIMPLE.include?(evaluation) | ||
| report_openai_safety_check("InvalidResponse") |
There was a problem hiding this comment.
+1, or alternatively should we also pass through the attempt # we're on so we can use that as a dimension value?
| def request_chat_completion(messages, temperature = DEFAULT_TEMPERATURE) | ||
| # Extra empty set is included to provide generic coverage to new parameters that OpenAI adds in the future | ||
| # Examples include response_format for json response formatting and tools for function calling | ||
| def request_chat_completion(messages, temperature = DEFAULT_TEMPERATURE, extra: {}) |
There was a problem hiding this comment.
Could this generic extra param just be an optional request_parameters so you don't have to merge later on?
| body = JSON.parse(http_response.body) | ||
| raise StandardError.new(body['error']) if body['error'] | ||
| # Check if the returned content is a json, else throw error | ||
| response_json = body.dig("choices", 0, "message", "content") |
There was a problem hiding this comment.
+1, could this be consolidated with the above method? and/or common code moved to a helper? Looks like the key differences here are 1) passing in the response format and 2) parsing the JSON response?
| rescue JSON::ParserError | ||
| raise StandardError.new("Unexpected JSON response, got: #{response_json}") | ||
| end | ||
| http_response |
There was a problem hiding this comment.
Why are we returning the whole http_response here? Would it be simpler to just return the parsed JSON since we're already checking that it's valid?
There was a problem hiding this comment.
I think I was being overly cautious; I had built up a lot of edge case handling around the structured responses (e.g. if it still gives a malformed response or returns empty) since while OpenAI advertises structured responses as being much more fixed than typical responses, they still included a caveated in their documentation that it could still possibly ignore the instructions. This means there might (very) occasionally be times where you have to fallback onto the body content instead of checking the intended json output.
However, I think you are correct in that we could totally just be handling all that logic within this function instead. I think the only logical reason why we would return the full response is to be able to log the exact kind of error (e.g. malformed structured response, empty response, response inside content body) but, on thinking about it further, it doesn't really make sense to bother logging that since we wouldn't be able to meaningfully action on those errors anyways.
Anyways TL;DR: I basically dunno after the weekend LOL. But I think I was probably being overly cautious and trying to include error messages for very specific edge cases.
| end | ||
| raise "OpenAI request failed with status #{response.code}: #{response.body}" unless response.success? | ||
|
|
||
| evaluation = JSON.parse(response.body)['choices'][0]['message']['content'] |
There was a problem hiding this comment.
Also not your change - but I just realized that this helper calls the OpenaiChatHelper directly rather than the AichatSafetyHelper, of which the latter already does some work to unwrap and verify this response. It might be worth using AichatSafetyHelper#request_chat_completion to further simplify some of this (and that way it's analogous to AichatSafetyHelper#request_structured_chat_completion, or better yet if those two functions can be combined, then we're calling the same function in both cases).
|
|
||
| # Makes a structured output request to GPT for moderation classification. | ||
| private def request_structured_safety_check(text, level_id) | ||
| safety_json_schema = { |
There was a problem hiding this comment.
nit: might be nice to move this to a constant
…utput-updates Incorporated feedback to remove extraneous function/call by instead directly invoking the main function with a clarified optional parameter options

Changes Summary
Files Changed
Testing Notes
Current Production Build:
Local Testing