Some of the example evals are failing and need to be fixed, for example: FAILED examples/triage_agent/evals.py::test_conversation_is_successful[messages0] - assert False == True, or this one:
___________________________________________________________________________________________________ test_does_not_call_weather_when_not_asked[Hi!] ____________________________________________________________________________________________________
query = 'Hi!'
    @pytest.mark.parametrize(
        "query",
        [
            "Who's the president of the United States?",
            "What is the time right now?",
            "Hi!",
        ],
    )
    def test_does_not_call_weather_when_not_asked(query):
        tool_calls = run_and_get_tool_calls(weather_agent, query)
>       assert not tool_calls
E       assert not [{'function': {'arguments': '{"location": "New York", "time": "now"}', 'name': 'get_weather'}, 'id': 'call_0', 'type': 'function'}]
examples/weather_agent/evals.py:44: AssertionError
==== short test summary info =====
FAILED examples/weather_agent/evals.py::test_does_not_call_weather_when_not_asked[Who's the president of the United States?] - assert not [{'function': {'arguments': '{"location": "United States"}', 'name': 'get_weather'}, 'id': 'call_0', 'type': 'function'}]
FAILED examples/weather_agent/evals.py::test_does_not_call_weather_when_not_asked[What is the time right now?] - assert not [{'function': {'arguments': '{"location": "", "time": "now"}', 'name': 'get_weather'}, 'id': 'call_0', 'type': 'function'}]
FAILED examples/weather_agent/evals.py::test_does_not_call_weather_when_not_asked[Hi!] - assert not [{'function': {'arguments': '{"location": "New York", "time": "now"}', 'name': 'get_weather'}, 'id': 'call_0', 'type': 'function'}]
====== 3 failed, 3 passed in 3.10s ======
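When triaging failures like the ones above, it may help to reduce each tool call to its function name and parsed arguments, so the unwanted `get_weather` call stands out in the assertion message. The following is only a minimal sketch: `summarize_tool_calls` is a hypothetical helper, not part of the repository, and it assumes tool calls have the dictionary shape shown in the log.

```python
import json


def summarize_tool_calls(tool_calls):
    """Return (function name, parsed arguments) pairs for each tool call.

    Hypothetical helper: assumes each call looks like the dicts in the
    pytest output, with a JSON string under call["function"]["arguments"].
    """
    summary = []
    for call in tool_calls:
        fn = call["function"]
        summary.append((fn["name"], json.loads(fn["arguments"])))
    return summary


# Tool call copied from the failing "Hi!" case in the log.
tool_calls = [
    {
        "function": {
            "arguments": '{"location": "New York", "time": "now"}',
            "name": "get_weather",
        },
        "id": "call_0",
        "type": "function",
    }
]

print(summarize_tool_calls(tool_calls))
# prints [('get_weather', {'location': 'New York', 'time': 'now'})]
```

In an eval, the summary could be asserted against instead of the raw list, for example `assert summarize_tool_calls(tool_calls) == []`, which keeps the failure message short.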