OpenAIJob
meganno_client.llm_jobs.OpenAIJob
The OpenAIJob class handles calls to OpenAI APIs.
__init__(label_schema={}, label_names=[], records=[], model_config={}, prompt_template=None)
Init function
Parameters:
Name | Type | Description | Default |
---|---|---|---|
label_schema | list | List of label objects | {} |
label_names | list | List of label names to be used for annotation | [] |
records | list | List of records in [{'data': , 'uuid': }] format | [] |
model_config | dict | Parameters for the OpenAI model | {} |
prompt_template | str | Template from which the prompt to OpenAI is prepared for each record | None |
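A minimal construction sketch, not taken from the official docs: the label schema entries, record contents, model parameters, and the `{data}` placeholder in the template are illustrative assumptions; substitute the values your MEGAnno project actually uses.

```python
from meganno_client.llm_jobs import OpenAIJob

# Hypothetical inputs -- the exact schema fields, record contents, and
# template placeholder syntax depend on your MEGAnno project setup.
label_schema = [
    {"name": "sentiment", "options": [{"value": "pos"}, {"value": "neg"}]}
]
records = [
    {"data": "I love this product!", "uuid": "rec-001"},
    {"data": "Shipping took forever.", "uuid": "rec-002"},
]
model_config = {"model": "gpt-3.5-turbo", "temperature": 0.0}
prompt_template = (
    "Classify the sentiment of the text as pos or neg.\nText: {data}\nLabel:"
)

job = OpenAIJob(
    label_schema=label_schema,
    label_names=["sentiment"],
    records=records,
    model_config=model_config,
    prompt_template=prompt_template,
)
```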
set_openai_api_key(openai_api_key, openai_organization)
Set the API keys necessary for calls to the OpenAI API
Parameters:
Name | Type | Description | Default |
---|---|---|---|
openai_api_key | str | OpenAI API key provided by user | required |
openai_organization | str[optional] | OpenAI organization key provided by user | required |
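A short usage sketch, assuming the `job` instance from the construction example above and that both keys are available as environment variables:

```python
import os

# Read the credentials from the environment rather than hard-coding them.
job.set_openai_api_key(
    openai_api_key=os.environ["OPENAI_API_KEY"],
    openai_organization=os.environ["OPENAI_ORGANIZATION"],
)
```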
validate_openai_api_key(openai_api_key, openai_organization)
staticmethod
Validate the OpenAI API and organization keys provided by the user
Parameters:
Name | Type | Description | Default |
---|---|---|---|
openai_api_key | str | OpenAI API key provided by user | required |
openai_organization | str[optional] | OpenAI organization key provided by user | required |
Raises:
Type | Description |
---|---|
Exception | If the API keys provided by the user are invalid, or if any error occurs when calling the OpenAI API |
Returns:
Name | Type | Description |
---|---|---|
openai_api_key | str | OpenAI API key |
openai_organization | str | OpenAI Organization key |
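A hedged sketch of calling the static validator; unpacking the two documented return values as a pair is an assumption, and the environment variable names are placeholders:

```python
import os

from meganno_client.llm_jobs import OpenAIJob

try:
    # Assumes the two documented return values come back together.
    api_key, org = OpenAIJob.validate_openai_api_key(
        openai_api_key=os.environ["OPENAI_API_KEY"],
        openai_organization=os.environ["OPENAI_ORGANIZATION"],
    )
except Exception as err:
    print(f"OpenAI credentials rejected: {err}")
```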
validate_model_config(model_config, api_name='chat')
staticmethod
Validate the LLM model config provided by the user. The model should be among the models allowed by MEGAnno, and the parameters should match the format specified by OpenAI
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_config | dict | Model specifications such as model name and other parameters, e.g. temperature, as provided by user | required |
api_name | str | Name of OpenAI API, e.g. "chat" or "completion" | 'chat' |
Raises:
Type | Description |
---|---|
Exception | If the model is not among the ones allowed by MEGAnno, or if the configuration format is incorrect |
Returns:
Name | Type | Description |
---|---|---|
model_config | dict | Model configurations |
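A minimal validation sketch; the configuration keys shown are assumptions about typical OpenAI chat parameters, not a list mandated by MEGAnno:

```python
from meganno_client.llm_jobs import OpenAIJob

# Candidate configuration -- model name and temperature are illustrative.
candidate_config = {"model": "gpt-3.5-turbo", "temperature": 0.2}

try:
    model_config = OpenAIJob.validate_model_config(candidate_config, api_name="chat")
except Exception as err:
    print(f"Rejected model configuration: {err}")
```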
is_valid_prompt(prompt)
Validate the generated prompt. It should not exceed the maximum token limit specified by OpenAI. We use the approximation 1 word ~ 1.33 tokens
Parameters:
Name | Type | Description | Default |
---|---|---|---|
prompt | str | Prompt generated for OpenAI based on the template and the record data | required |
Returns:
Type | Description |
---|---|
bool | True if prompt is valid, False otherwise |
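A hedged sketch of the check, assuming the `job` and `prompt_template` from the earlier examples; the word-to-token ratio mentioned above is only an approximation:

```python
# Render one prompt by hand (illustrative; generate_prompts() below does this per record).
prompt = prompt_template.format(data="I love this product!")

if job.is_valid_prompt(prompt):
    print("Prompt fits within the approximate token limit.")
else:
    print("Prompt likely exceeds the OpenAI token limit; shorten the template or record.")
```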
generate_prompts()
Helper function. Given a prompt template and a list of records, generate a list of prompts, one for each record
Returns:
Name | Type | Description |
---|---|---|
prompts | list | List of tuples of (uuid, generated prompt) for each record in the given subset |
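A short usage sketch, assuming the documented (uuid, generated prompt) tuple layout:

```python
prompts = job.generate_prompts()
for uuid, prompt in prompts:
    # Each record's uuid is paired with the prompt rendered from the template.
    print(uuid, prompt[:60])
```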
get_response_length()
Return the length of the OpenAI response
get_openai_conf_score()
Return the confidence score of the label, calculated as the average of the logit scores
preprocess()
Generate the list of prompts for each record based on the subset and template
Returns:
Name | Type | Description |
---|---|---|
prompts | list | List of prompts |
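A minimal sketch of the preprocessing step, again assuming the `job` from the construction example:

```python
# Render one prompt per record from the template before calling OpenAI.
prompts = job.preprocess()
print(f"{len(prompts)} prompts ready for submission")
```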
get_llm_annotations(batch_size=1, num_retrials=2, api_name='chat', label_meta_names=[])
Call OpenAI with the generated prompts to obtain valid and invalid responses
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch_size | int | Size of each batch of prompts sent to OpenAI | 1 |
num_retrials | int | Number of retries to OpenAI in case of a failed response | 2 |
api_name | str | Name of OpenAI API, e.g. "chat" or "completion" | 'chat' |
label_meta_names | list | List of label metadata names to be set | [] |
Returns:
Name | Type | Description |
---|---|---|
responses | list | List of valid responses from OpenAI |
invalid_responses | list | List of invalid responses from OpenAI |
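A hedged call sketch; unpacking the two documented return values as a pair and the "conf" metadata name are assumptions for illustration:

```python
responses, invalid_responses = job.get_llm_annotations(
    batch_size=1,
    num_retrials=2,
    api_name="chat",
    label_meta_names=["conf"],  # hypothetical metadata name, e.g. a confidence score
)
print(f"{len(responses)} valid / {len(invalid_responses)} invalid responses")
```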
extract(uuid, response, fuzzy_extraction)
Helper function for post-processing. Extract the label (name and value) from the OpenAI response
Parameters:
Name | Type | Description | Default |
---|---|---|---|
uuid | str | Record uuid | required |
response | str | Output from OpenAI | required |
fuzzy_extraction | bool | Set to True if fuzzy extraction is desired in post-processing | required |
Returns:
Name | Type | Description |
---|---|---|
ret | dict | Returns the label name and label value |
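A sketch of extracting a single label; the uuid and raw response string are placeholders:

```python
# Pull the label name and value out of one raw OpenAI response.
ret = job.extract(
    uuid="rec-001",          # record uuid (placeholder)
    response="positive",     # raw model output for that record (placeholder)
    fuzzy_extraction=True,   # tolerate near-matches such as "positive" vs "pos"
)
print(ret)
```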
post_process_annotations(fuzzy_extraction=False)
Extract outputs from the responses generated by the LLM and format them according to the MEGAnno data model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fuzzy_extraction | bool | Set to True if fuzzy extraction is desired in post-processing | False |
Returns:
Name | Type | Description |
---|---|---|
annotations | list | List of annotations (uuid, label) in the format required by MEGAnno |
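An end-to-end sketch tying the steps together, assuming the `job` built in the first example; how the returned annotations are then imported into MEGAnno depends on your workflow:

```python
# preprocess -> call OpenAI -> post-process into MEGAnno-style annotations.
job.preprocess()
job.get_llm_annotations(batch_size=1, num_retrials=2, api_name="chat")
annotations = job.post_process_annotations(fuzzy_extraction=False)
print(f"Prepared {len(annotations)} (uuid, label) annotations")
```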