Chat models take a list of messages as input and return a model-generated message as output. Although the chat format is designed to make multi-turn conversations easy, it’s just as useful for single-turn tasks without any conversation.
An example Chat Completions API call using LangChain looks like the following:
```python
from langchain.schema import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI

# Point the OpenAI-compatible client at the Block Entropy endpoint
llm = ChatOpenAI(
    temperature=0,
    openai_api_base="https://api.blockentropy.ai/v1",
    openai_api_key="be_...",
    model="be-120b-tessxl",
    streaming=True,
    max_tokens=1024,
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Write a blog post about large language models."),
]

# Stream tokens to stdout as they are generated
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
```
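Because the API is stateless, multi-turn conversations are built by resending the full message list each call: the `SystemMessage`, `HumanMessage`, and `AIMessage` objects serialize to the standard chat format of role/content pairs (`system`, `user`, `assistant`). A minimal sketch of maintaining that history between calls, with a hypothetical `add_turn` helper and plain dicts standing in for the message objects:

```python
# Conversation state is just the message list; each call sees the whole history.
history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a blog post about large language models."},
]

def add_turn(history, assistant_reply, next_user_msg):
    """Append the model's reply and the next user message, preserving order.

    `assistant_reply` would come from the model's response (e.g. the
    concatenated stream chunks in the example above).
    """
    history.append({"role": "assistant", "content": assistant_reply})
    history.append({"role": "user", "content": next_user_msg})
    return history

add_turn(history, "Large language models are...", "Now shorten it to one paragraph.")
# The next call sends all four messages, so the model keeps the context.
```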